Final Report on the International Workshop
on  High-level  Language  Computer  Architecture 


Reported  by  Yaohan  Chu 
June  30,  1980 


This is the final report for the International Workshop on HLLCA.

The Workshop was made possible in part by support from the ONR. The
details of the Workshop are reported below.

1.  Summary  of  the  Grant 

Title:  International  Workshop  on  High-level  Language  Computer  Architecture 

Period:  7/1/79  -  6/30/80 

Grant  no.:  N00014-79-C-0604 

Grant  Amount:  $9,860.00 

Principal  Investigator:  Professor  Yaohan  Chu 

Department  of  Computer  Science 
University  of  Maryland 
College  Park,  MD  20742 
301-454-4245 

2.  Workshop 

Date: May 26-28, 1980

Location: Fort Lauderdale, FL

No. of Registrants:
Technical program:
Tutorial program:

Programs: See Appendix C

Proceedings:

3. Organization

The workshop was organized by the Workshop Committee. There are
four members on the Workshop Committee; the names are shown in Appendix C.

The technical program was organized by the Program Committee, whose
chairman is Dr. Yaohan Chu. There are 17 members; the names of these members
are also shown in Appendix C. There were 26 papers in 8 sessions in addition
to a panel discussion session. The details of this program are shown in
Appendix C.

The tutorial program was organized by Dr. Keith Doty. There were 5
lecturers; each provided a set of notes. The names of the lecturers are
shown in Appendix C. The other working members of the workshop are also
shown in Appendix C.

The Workshop Committee approved travel allowances for 4
international participants; presenting a paper was the minimum requirement.
Their names are shown below.
(1) Professor Yoong-Nien Chen
Department of Computers
University of Science and Technology
City of Hefei, Province of Anhui,
The People's Republic of China

Amount: $1,000.

(2) Dr. Masahiro Yamamoto
Central Research Laboratory
Nippon Electric Company, Ltd.
Japan

Amount: $750.

(3) Dr. Esen A. Ozkarahan
Middle East Technical University
Ankara, Turkey

Amount: $500.

(4) Mr. J.P. Sansonnet
Universite Paul Sabatier
Toulouse, France

Amount: $500.


4.  Next  Workshop 


The Workshop Committee met on May 28, 1980 and decided to hold
another workshop because attendance exceeded expectations. The following
were decided:


Date: May 17-20, 1982

Location:  Fort  Lauderdale 
Program  Chairman:  Dr.  Lee  Hoevel 
Program  Vice  Chairman:  Dr.  George  Ligler 


5.  International  Participation 


The  Workshop  is  truly  international  as  there  were  participants  from 
12  countries:  Brazil,  Canada,  China,  France,  Ireland,  Italy,  Japan,  Sweden, 
Turkey,  United  Kingdom,  U.S.A.,  West  Germany. 
6. Official Reports Distribution List


Defense Documentation Center                     12 copies
Cameron Station
Alexandria, VA 22314

Office of Naval Research
Arlington, VA 22217
Information Systems Program (437)                2 copies
Code 200                                         1 copy
Code 455                                         1 copy
Code 453                                         1 copy

Office of Naval Research                         1 copy
Branch Office, Boston
Bldg 114, Section D
666 Summer Street
Boston, MA 02210

Office of Naval Research                         1 copy
Branch Office, Chicago
536 South Clark Street
Chicago, IL 60605

Office of Naval Research                         1 copy
Branch Office, Pasadena
1030 East Green Street
Pasadena, CA 91106

Naval Research Laboratory
Technical Information Division, Code 2627
Washington, D.C. 20375

Dr. A. L. Slafkosky
Scientific Advisor
Commandant of the Marine Corps (Code RD-1)
Washington, D.C. 20308

Naval Ocean Systems Center
Advanced Software Technology Division
Code 5200
San Diego, CA 92152

Mr. C. H. Gleissner
Naval Ship Research & Development Center
Computation and Mathematics Department
Bethesda, MD 20084

Captain Grace M. Hopper (008)
Naval Data Automation Command
Washington Navy Yard
Building 166
Washington, D.C. 20374
Leon S. Levy

Bell  Telephone  Laboratories 
Whippany,  NJ  07981 
(201)  386-4955 


Tim Merrigan
Floating Point Systems
P.O. Box 23489
Portland, OR 97223
(503) 641-3151


Hartmut G. Huber
Naval Surface Weapon Center
Box 117
Dahlgren, Va. 22448
(703) 663-8656 (office)
(703) 775-7046 (home)

N.R.  Harris 

Stanford  University 

Computer  Systems  Lab 

Department  of  Electrical  Engineering 

Stanford  CA  94305 

(415)  497-3511 

Mary  Miller 

Bell  Laboratories 

30W062  Capistrano  Ct.  Apt.  302 

Naperville  IL  60540 

(312)  462-4269  (office) 

John  J.  Zaloudek 
Naval  Surface  Weapons  Center 
Dahlgren,  Va.  22401 
(703)  663-7368 

E.  Dean  Earnest 
Burroughs  Corporation 
25725  Jeronimo  Rd. 

Mission  Viejo,  CA  92691 
(714)  768-2321 

Heinz Schlutter
Gesellschaft für Mathematik und
Datenverarbeitung, MBH
Postfach 1240
Schloss Birlinghoven
D-5205 St. Augustin 1
Bonn, West Germany

Dr.  Klaus  Berkling 

(same  address  as  Schlutter) 

Giorgio Sofi
CSELT, Via Reiss Romoli
Torino, Italy 10129
tele. 21691


Reinhard G. Kofer
Siemens AG, ZFE-FL-SAR 112
Otto Hahn Ring 6
8 Muenchen 83 West Germany

Richard C. Fleming
The Aerospace Corporation
M.S. A2/2043
P.O. Box 92957
Los Angeles CA 90009
(213) 648-7098

Dr. G. U. Merckel
IBM Dept. 24K Bldg 032-3
2000 NW 51 Street
Boca Raton, FL 33432
(305) 994-4763

Melvin Hallerman
IBM Dept. 24K Bldg 632-3
2000 NW 51 Street
Boca Raton, FL 33432

Kerry V. Richmond
McDonnell Douglas Astronautics Co.
P.O. Box 516
St. Louis, MO 63166

James D. Mooney
West Virginia University
Dept. Stat. & Comp. Science
Morgantown, WV 26506
(304) 293-3607

Meir Kaftor M/S B100
Honeywell Information Systems
P.O. Box 6000
Phoenix, AZ 85005
(602) 866-3381

Nobuyuki Goto
Toshiba Corporation
1 Komukai-Toshiba-cho, Saiwai-ku
Kawasaki, Japan 210
(044) 511-2111
List of Registrants (Technical Program)


Jack  B.  Dennis 

MIT  Lab  for  Computer  Science 
545  Main  Street 
Cambridge, MA 02139
(617) 253-6856


Harvey G. Cragon
Texas Instruments, Inc.
P.O. Box 225012
Dallas, TX 75265
(214) 238-3023


Mou-Shin Yang
Systems Engineering
6901 W. Sunrise Blvd.
Ft. Lauderdale, Fla. 33313
(305) 587-2900 X6236

Gilbert J. Hansen
Texas Instruments
P.O. Box 222013, MS 3407
Dallas, TX 75222
(214) 462-4742

Daniel L. Slotnick
University of Illinois
283 Digital Computer Lab
Dept. of Computer Science
(217) 333-6726

Terry  Welch 
Sperry  Research 
100  North  Rd. 

Sudbury,  MA.  01776 
(617)  369-4000 

Samuel P. Harbison
Carnegie-Mellon University
602A Kelly Ave
Pittsburgh, Pa. 15221
(412) 731-1472

Charles  W.  Flink  II 

Naval  Surface  Weapon  Center 

K-74 

Dahlgren,  Va.  22401 
(703)  663-7517 

Bill Kwinn
Hewlett  Packard 
3404  E.  Harmony  Road 
Fort  Collins,  CO  80525 
(303)  226-3800  X3242 

Jaishanker  Menon 
Dept  of  Computer  Science 
Ohio  State  University 
Columbus  Ohio  43210 
(614)  422-5813 


Leon I. Maissel
IBM Corp
Dept. C14, Bldg 704,
P.O. Box 390
Poughkeepsie, NY 12602
(914) 463-2301

Raymond L. Phoenix
IBM Corp
Dept. C14, Bldg 704,
P.O. Box 390
Poughkeepsie, NY 12602
(914) 463-5445

Zvi  Weiss 

IBM  Research  Center 
Yorktown  Heights,  NY  10598 
(914)  962-7036 

Richard  Ramseyer 
Honeywell  SRC  Research 
2600  Ridgway  Pkwy,  MN17-2352 
Minneapolis,  MN  55413 
(  )  378-5023 

Tetsuo Ida
Institute of Physical & Chem. Res.
2-1, Hirosawa,
Wako-shi, Saitama 351
Japan

Greg Bettice
Naval Avionics Center
8125  Harrison  Drive 
Lawrence,  IN  46226 
(317)  353-3226 

Roger  R.  Bate 

Texas  Instruments,  Inc. 

P.O.Box  222013,  M/S  3407 
Dallas,  TX  75222 
(214)  462-4790 

Ron  Rutledge 
DOT/TSC,  P.O.Box  53 
Kendall  Square 
Cambridge,  MA  02142 
(617)  494-2038 


Gerhard Herrscher
LITEF
Loerracher Strasse 18
7800 Freiburg
West Germany
0761-4901212

A. Speckhard
Aerospace Corporation
2350 E. El Segundo Blvd.
El Segundo CA 90245
(213) 648-7067

John Francis
Sanders Associates, Inc.

95  Canal  Street 
Nashua  NH  03060 

(603)  885-3746 

Paula  Bernstein 
Bell  Laboratories 
Warrenville-Naperville  Rds. 

Naperville IL 60540
(312) 462-2898

R. F. Hobson
Simon Fraser University
S. F. University
Computer Science Department
Burnaby British Columbia V5A 1S6

(604)  291-4277 

Dr. Werner Kluge
GMD/ISF
Postfach 1240
Schloss Birlinghoven
West Germany

Malcolm Muir
Datamedix, Inc.

555  Hillsboro  Plaza 
Deerfield  Beach  FL  33441 
(305)  428-4526 

Ronald  L.  Engelbrecht 
NCR  Corp.  -  E&M-Wichita 
3718  N.  Rock  Road 
Wichita  KS  67218 
(316)  688-8646 

Dr.  F.J.  Burkowski 

Computer  Science  Department 

University  of  Manitoba 

Room  545  Machray  Hall 

Winnipeg Manitoba, Canada R3T 2N2
(204) 474-8313


Allen  Baum 

Hewlett-Packard 

HPL/CRL 

1501  Page  Mill  Rd. 

Palo  Alto  CA  94304 
857-8776 

Keiji Kuwahara
Nikkei-McGraw-Hill

2-1-2  Uchikanda,  Chiyoda-ku 

Tokyo  Japan 

(03)  256-1561 

Y.  El-zi* 

Honeywell 
Honeywell  Plaza 
Minneapolis Minnesota 55408

David E. Heinen
Tektronix, Inc.
P.O. Box 500 DS 63-311
Beaverton OR 97077
(503) 682-3411 x3845

Lawrence Katz
Tektronix, Inc.
P.O. Box 500 DS 63-311
Beaverton OR 97077
(503) 682-3411 x3081

R. Curtis
Canisius College
2011 Main Street
Buffalo NY 14208
Ol*'  831-7000 

John  Bowles 

NCR  Corporation 

3325  Platt  Springs  Rd. 

C.  Columbia  SC  29169 
(803)  796-9250  x524 

David M. Abrahamson
Department  of  Computer  Science 
Trinity  College 
Dublin  2  Ireland 
772941  Ext.  1765 

Hugh  L.  Applewhite 
Honeywell  17-2352 
2600 Ridgway N.E.
Minneapolis, MN 55413
(612)  378-4510 


David  K.  Hsiao 
Ohio  State  University 
Department  of  Computer  Science 
Columbus  Ohio  43210 
(614)  422-5813 

M.  Tsuchiya 
TRW  DSSG  E2/2036 
One  Space  Park 
Redondo  Beach  CA  90278 

(213)  535-0580 

Dr.  William  D.  Murray 
University  of  Colorado 
1100  14th  Street 
Denver CO 80202
(303) 629-2872

Bantwal R. Rau
Coordinated Science Laboratory
University of Illinois
Urbana IL 61801
Leonard Haynes
Office of Naval Research

Yaohan Chu
University of Maryland
Department of Computer Science

Southern Methodist University
Department  of  Computer  Science 
Dallas  TX  75275 

(214)  692-3095 

Barry  C.  Goldstein 
IBM  T.J. Watson  Research 
24  Glen  Terrace 
Chappaqua NY 10514
(914)  945-2693  (office) 

David A. Patterson
University  of  California 
Electrical  Engineering  and 
Computer  Sciences 
Computer  Science  Division 
Berkeley,  California  94720 

G. T. Ligler
Burroughs Corp.
P.O. Box 517
Paoli,  PA  19301 
2A5-6U0-32l*8 


Robert  P.  Gnelik 
Bell Laboratories
Room 7D-414
600 Mountain Ave.
Murray Hill NJ 07974
(201) 582-5797

David R. Ditzel
Bell Laboratories
2C-523
Murray Hill NJ 07974

(201)  582-3655 

Thomas  A.  Almy 
Tektronix,  Inc.  M/S  50-384 
Box  500 

Beaverton  OR  97077 
644-0161  x6056 

Herman Härtig
Universität Karlsruhe
Institut für Informatik IV
Postfach 6380

31400 Toulouse, France

Jean-Paul Sansonnet

Royal Institute of Technology

Royal Institute of Technology

Dennis A. Roberson

Dick Conn
HIGH-LEVEL LANGUAGE COMPUTER *

Abstract


This paper presents an internal language for
a high-level language computer to facilitate
parallel execution of arithmetic expressions and
concurrent statements and to perform try-ahead
operations for IF, WHILE, and REPEAT statements.
The architecture of such a computer is also
described, which consists of multiple independent
processors for language processing and parallel
computation. The increase in speed is achieved by
parallel execution, by try-ahead processing, and
by the pipeline effect created by the independent
processors simultaneously performing various
tasks. An algorithm that translates an arithmetic
expression into the internal language form is also
included in the Appendix.


I.  Introduction 

In the area of high-level computer
architecture, various machine organisations have
been proposed with features to increase the
program processing speed [1] [2]. These designs
include independent processors to perform various
tasks in language translation and execution, such
as the lexical processor, syntactic processor,
semantic processor, arithmetic processor, etc.
These processors operate simultaneously and
asynchronously, and create a pipeline effect in
the whole system. The concurrency among these
processors results in the speed increase in
language translation and execution.

In this paper, however, we look into another
possibility of gaining speed in high-level
language computers, namely, the parallel execution
of arithmetic expressions and concurrent
statements, and the try-ahead processing of
statements involving conditions, such as IF,
WHILE, and REPEAT. The scheme that we use here
calls for an indirect-execution architecture which
uses an internal language and is of type 3
according to Chu's classification [3]. Source
programs are translated into the internal
representation, which is then interpreted by the


* Research reported herein was supported in part
by NSF-MCS-77-23496.
machine hardware. We will describe first the
features in the internal language that make
parallel and try-ahead operations possible, and
then the computer organisation for carrying out
these operations. We will discuss only the
features in the internal language that are
relevant to parallel execution and try-ahead
processing, and ignore others such as identifiers,
labels, etc., since they are immaterial to the
purpose of this paper and they can be found in
other papers, e.g. [1] [4]. The syntax and
semantics of the high-level language constructs
are the same as those in PASCAL.

In Section 2.1, we first briefly describe the
notion of Parallel Execution Strings (PES) for
executing arithmetic expressions in parallel, and
then propose a linear representation scheme as the
internal language for a high-level computer. An
algorithm which translates an arithmetic
expression into the internal language form is
included in the Appendix. In Section 2.2, we
present a method of representing a concurrent
statement in the internal language so that it can
be executed concurrently. In Sections 2.3 through
2.5, we describe the representation of IF
statements, WHILE statements, and REPEAT
statements in the internal language for try-ahead
processing. The representation allows the
possible paths in a statement involving a
condition to be executed even before the
evaluation of the condition is completed.
Finally, a high-level computer organisation is
presented in Section III, which includes
independent processors for language processing,
and multiple Semantic Processors and PES Access
Processors for parallel computations. In the
computer organisation, each execution stream is
accessed and executed by a PES Access Processor
and a Semantic Processor. Each Semantic Processor
has its own Arithmetic Processor and Local Storage
for concurrent processing and try-ahead
processing.


II. Internal Language Constructs

2.1 Arithmetic Expressions for Parallel Execution

A scheme for decomposing arithmetic
expressions for parallel execution, called the
Parallel Execution String (PES), has been proposed
in [5] [6]. It can be summarised as follows.


Definition

In an expression tree, an operator node is
called

type 1: if all of its operands are
variables or constants;

type 2: if exactly one of its operands is
an operator; and

type 3: if it is a binary operator and
both of its operands are
operators.

Consider an expression in its tree
representation. Those operator nodes, the
operands of which are variables or constants
(i.e., type 1), will be the starting points of the
parallel execution strings. Beginning at the
starting points, these strings are executed in the
direction toward the root node, each of which can
be simultaneously executed by an independent
processor. Each processor executes the type 1
operators in a string one by one at its maximum
speed without waiting. At an operator node where
two strings meet (i.e., type 3), the processor
which reaches this node first will deposit the
partial result it obtains thus far into a
temporary storage and then stop, whereas the other
processor which reaches this node later will
execute the operation at the merging node and
continue to execute the remaining string. For
example, the expression tree in Figure 1 has three
type 1 nodes: A+B, C*D, and G-H; and hence there
are three parallel execution strings. The two
type 3 nodes in Figure 1 are labeled as #1 and #2,
respectively. Note that the number of type 3
nodes is always one less than the number of type 1
nodes.
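
The node classification can be checked mechanically. The following is a minimal sketch, not the authors' algorithm: it assumes the expression of Figure 1 reads J-(A+B)*(C*D+E-F/(G-H)), and the names `Op`, `node_type`, and `count_types` are hypothetical helpers introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Op:
    """An operator node; leaves are plain strings (variables or constants)."""
    symbol: str
    left: "Node"
    right: Optional["Node"] = None  # None for a unary operator


Node = Union[str, Op]


def node_type(node: Op) -> int:
    """Return 1, 2 or 3 depending on how many operands are operators."""
    operands = [node.left] + ([node.right] if node.right is not None else [])
    return 1 + sum(isinstance(x, Op) for x in operands)


def count_types(node: Node, counts=None) -> dict:
    """Count the operator nodes of each type in an expression tree."""
    if counts is None:
        counts = {1: 0, 2: 0, 3: 0}
    if isinstance(node, Op):
        counts[node_type(node)] += 1
        count_types(node.left, counts)
        if node.right is not None:
            count_types(node.right, counts)
    return counts


# Assumed reading of the Figure 1 expression: J-(A+B)*(C*D+E-F/(G-H))
tree = Op("-", "J",
          Op("*",
             Op("+", "A", "B"),
             Op("-",
                Op("+", Op("*", "C", "D"), "E"),
                Op("/", "F", Op("-", "G", "H")))))
counts = count_types(tree)
```

Under that reading the tree has three type 1 nodes and two type 3 nodes, which agrees with the rule that type 3 nodes number one less than type 1 nodes.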


The expression J-(A+B)*(C*D+E-F/(G-H))
can be represented as a tree (the tree diagram of Figure 1 is not reproduced here).


To implement this concept in a high-level
language computer, we have to devise a linear
representation for the parallel execution strings
in an expression tree and use it as the internal
language for the high-level language computer.
With this internal language, the entry points of
the strings are chained as a linked list by
pointers called Parallel Pointers. For the
operator where two strings meet, one of its two
operands is the result of the previous operation
in the processor and hence need not be specified,
and the other operand is represented by #i, where
i is a unique number identifying a temporary
storage for the partial result obtained by the
processor executing the other string. The first
of the two merging strings has a Jump Pointer
following the merging point operator and pointing
to the location that immediately follows the
merging point operator in the second string.

To eliminate the need of a stack during the
execution of arithmetic expressions, the ordering
of operands will be reversed in the following
situation: when the result of the previous
operator is the second operand of the current
operator, the first operand will appear as the
second operand in this representation. Thus, if
the operator is non-commutative, it will be marked
with an apostrophe following the operator to
indicate that the ordering of its operands is
reversed.

Figure 1 is an example of representing an
arithmetic expression in the internal language.
In Figure 1, Para represents a Parallel Pointer,
and Jump a Jump Pointer. When a PES Access
Processor executes a Parallel Pointer (see Section
III and Figure 3), it will put the pointer value
into one of the Entry Point Registers so that the
next string can be chosen for execution as soon as
another PES Access Processor becomes free.
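
The hand-off through the Entry Point Registers can be pictured with a small sketch. This is an illustration under assumptions, not the hardware design: the class name `EntryPointRegisters`, the number of registers, the first-free and first-non-empty selection policies, and the behaviour when every register is full (here `put` simply returns False) are all hypothetical, since the text does not specify them.

```python
class EntryPointRegisters:
    """A small bank of registers holding entry points of pending strings."""

    def __init__(self, n):
        self.regs = [None] * n  # None marks an empty register

    def put(self, address, processor_id):
        """Store a Parallel Pointer value; return False if no register is free."""
        for k, value in enumerate(self.regs):
            if value is None:
                self.regs[k] = (address, processor_id)
                return True
        return False

    def take(self):
        """Give a free processor a non-empty entry and clear that register."""
        for k, value in enumerate(self.regs):
            if value is not None:
                self.regs[k] = None
                return value
        return None


epr = EntryPointRegisters(2)
stored = [epr.put(100, 0), epr.put(200, 0), epr.put(300, 0)]
first = epr.take()   # entry point picked up by a free processor
second = epr.take()
third = epr.take()   # nothing left to start
```

Storing the processor identification next to the address follows the description of Parallel Pointers given in Section III.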


2.2  Concurrent  Statements 
The expression of Figure 1 can be translated into the internal language as:

    Para  A B + #1 * Jump
    Para  C D * E + #2 - Jump
          G H - F /' #2 -' #1 * J -'

(The pointer arrows of the original figure are not reproduced.)

Fig. 1 Example of Translating an Expression
into the Internal Language
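
The merging rule of Section 2.1 can be tried out on the three strings of Figure 1. The sketch below is a sequential simulation under assumptions, not the machine itself: the strings are taken to be `A B + #1 * Jump`, `C D * E + #2 - Jump`, and `G H - F /' #2 -' #1 * J -'`; the token encoding (`("merge", i, op)` and `("jump", string, position)`), the jump targets, and the function names are hypothetical. Running the strings one after another in a given order stands in for processors reaching the merging nodes at different times.

```python
import itertools
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

# ("merge", i, op) is the pair "#i op"; ("jump", s, p) continues at
# position p of string s (assumed targets, just after the merging operator).
STRINGS = [
    ["A", "B", "+", ("merge", 1, "*"), ("jump", 2, 7)],
    ["C", "D", "*", "E", "+", ("merge", 2, "-"), ("jump", 2, 6)],
    ["G", "H", "-", "F", "/'", ("merge", 2, "-'"), ("merge", 1, "*"), "J", "-'"],
]


def apply_op(op, regs):
    """Apply op to the two operand registers; an apostrophe reverses them."""
    first, second = regs
    f = OPS[op.rstrip("'")]
    return f(second, first) if op.endswith("'") else f(first, second)


def run(order, env):
    """Execute the strings in the given order; return the final value."""
    temps = {}      # partial result storage, keyed by the number i of #i
    final = None
    for s in order:
        regs, pos, tokens = [], 0, STRINGS[s]
        while pos < len(tokens):
            tok = tokens[pos]
            pos += 1
            if isinstance(tok, tuple) and tok[0] == "jump":
                tokens, pos = STRINGS[tok[1]], tok[2]
            elif isinstance(tok, tuple):
                _, i, op = tok
                if i not in temps:          # first to arrive: deposit and stop
                    temps[i] = regs.pop()
                    break
                regs.append(temps.pop(i))   # second to arrive: merge, go on
                regs = [apply_op(op, regs)]
            elif tok.rstrip("'") in OPS:
                regs = [apply_op(tok, regs)]
            else:
                regs.append(env[tok])       # operand: load its value
        else:
            final = regs[0]                 # this processor reached the root
    return final


env = dict(A=1, B=2, C=3, D=4, E=5, F=6, G=7, H=4, J=10)
results = {run(order, env) for order in itertools.permutations(range(3))}
```

Every arrival order yields the same value as direct evaluation of J-(A+B)*(C*D+E-F/(G-H)), and at no point does a processor hold more than two operands, which is the stack-free property the representation is designed for.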


A concurrent statement [7] is a set of
statements enclosed by a header COBEGIN and a
trailer COEND; for example,


COBEGIN
statement 1;
statement 2;
...
statement n
COEND


The statements in a concurrent statement can
be executed simultaneously. A flowchart of the
above concurrent statement is shown in Figure 2.
To execute the concurrent statement, it will be
translated into the internal language as follows:


COBEGIN Para statement 1; Para statement 2; Para ... ; statement n COEND


Figure 2 Flowchart of a Concurrent Statement (flowchart not reproduced)


The processor that executes statement n will
execute COEND. The effect of executing COEND is
that the processor will halt its execution
temporarily until all the other processors become
free.

The semicolons in a concurrent statement will
be preserved in the internal language. A
semicolon indicates the end of a simple statement
in a concurrent statement and hence makes the
processor which is executing the simple statement
free.


2.3 IF Statements


IF statements will be processed with a
try-ahead method. The following IF statement

IF condition THEN statement 1 ELSE statement 2;

will be translated into the internal language as
follows:

Para condition IF Para THEN statement 1 THENEND
ELSE statement 2 ELSEND

(The pointer arrows of the original diagram are not reproduced.)

The processor which starts executing the IF
statement will set up the entry to the THEN clause
for a second processor, which in turn sets up the
entry to the ELSE clause for a third processor.
While the first processor is evaluating the
conditional expression, both the statement 1 and
the statement 2 are being executed simultaneously.
However, any environment changes resulting from
the execution of the statement 1 and the statement
2 are kept in the local storage of the second and
the third processors, respectively, and will have
no effect either on the execution of the other or
on the evaluation of the conditional expression.

Executing the symbols "THEN" and "ELSE"
causes the processor to enter the THEN state and
the ELSE state, respectively. When the THEN state
processor executes the symbol "THENEND", or when
the ELSE state processor executes the symbol
"ELSEND", the processor will halt its execution in
the WAIT state. However, "THENEND" and "ELSEND"
will have no effect on a processor which is in the
normal mode of operation.

When the first processor executes the symbol
"IF", it interrupts both the second and the third
processors. Depending upon the result of the
conditional expression, it makes one of the two
processors free immediately and discards any
computation the processor has done; any
environment changes made by the other processor
are copied into the main storage and the processor
becomes free. When that is done, the first
processor resumes execution from where the latter
processor was interrupted.
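
The commit-or-discard behaviour of try-ahead IF processing can be sketched in software. This is a minimal illustration under assumptions: it runs the two branches one after the other rather than on separate processors, it models local storage as a dictionary layered over main storage with `collections.ChainMap`, and the function name `try_ahead_if` and the example statements are hypothetical.

```python
from collections import ChainMap


def try_ahead_if(main, condition, then_branch, else_branch):
    """Run both branches against private local storage, then commit one."""
    then_local, else_local = {}, {}
    # Writes go to the local dictionary; reads fall through to main storage.
    then_branch(ChainMap(then_local, main))
    else_branch(ChainMap(else_local, main))
    # The condition sees main storage only, untouched by either branch.
    chosen = then_local if condition(main) else else_local
    main.update(chosen)  # copy the chosen changes; the others are dropped
    return main


def statement_1(env):
    env["y"] = env["x"] * 2


def statement_2(env):
    env["y"] = -1


main_storage = {"x": 3, "y": 0}
try_ahead_if(main_storage, lambda env: env["x"] > 0, statement_1, statement_2)
```

Because neither branch writes to main storage before the condition is known, the order in which the branches and the condition are evaluated does not change the result.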


2.4  WHILE  Statements 


WHILE statements and REPEAT statements will
be processed with the try-ahead method similar to
that for IF statements. However, only the
repetitive path will be tried in advance.

The WHILE statement


WHILE condition DO statement 1;


will be translated as:

Para condition WHILE
WHILEDO statement 1 WHILEND

(The pointer arrows of the original diagram are not reproduced.)

The processor which executes the conditional
expression sets up the entry to the WHILEDO path
for a second processor. The conditional
expression and the WHILEDO path are then
executed simultaneously.

Executing the symbol "WHILEDO" forces the
second processor to enter the WHILEDO state. The
environment changes made by a WHILEDO state
processor do not affect the main storage and are
only kept in the local storage of the processor.
A WHILEDO state processor will halt its execution
in the WAIT state when it executes the symbol
"WHILEND." However, the "WHILEND" will have no
effect on a processor which is in the normal mode
of operation.


When the first processor executes the symbol
"WHILE," it interrupts the second processor. If
the result of the conditional expression is FALSE,
the second processor becomes free immediately and
everything in its local storage will not be used.
The first processor then follows the WHILE pointer
to execute the next statement.

If the result is TRUE, the environment
changes stored in the local storage of the second
processor will be copied into the main storage,
and the second processor becomes free. The first
processor then resumes execution from where the
second processor was interrupted.
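
A sequential sketch of the WHILE case shows why discarding the last speculative iteration preserves the usual loop semantics. It rests on the same assumptions as any software model of this scheme: local storage is a dictionary layered over main storage, the body is run before the condition rather than alongside it, and `try_ahead_while` and the example loop are hypothetical names.

```python
from collections import ChainMap


def try_ahead_while(main, condition, body):
    """Speculate one iteration at a time; return how many were committed."""
    committed = 0
    while True:
        local = {}
        body(ChainMap(local, main))   # WHILEDO path, writing to local storage
        if not condition(main):       # condition evaluated on main storage
            break                     # FALSE: the local storage is not used
        main.update(local)            # TRUE: copy the changes to main storage
        committed += 1
    return committed


def loop_body(env):
    env["s"] = env["s"] + env["i"]
    env["i"] = env["i"] + 1


main_storage = {"i": 0, "s": 0}
iterations = try_ahead_while(main_storage, lambda env: env["i"] < 3, loop_body)
```

The sketch runs the body four times but commits only three iterations, leaving main storage exactly as an ordinary WHILE loop would.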
2.5 REPEAT Statements


The REPEAT statement


REPEAT
statement 1;
statement 2;
...
statement n
UNTIL condition;


will be translated as:

statement 1 statement 2 ... statement n Para condition UNTIL
REPEAT REPEATEND

(The pointer arrows of the original diagram are not reproduced.)

The processor which executes the conditional
expression sets up the entry to the REPEAT path
for a second processor. The conditional
expression and the REPEAT path are then executed
simultaneously. Executing the symbol "REPEAT"
forces the second processor to enter the REPEAT
state. The environment changes made by a REPEAT
state processor do not affect the main storage and
are only kept in the local storage of the
processor.

The symbol "REPEATEND" is added to the
statement. Execution of "REPEATEND" will have no
effect on a processor which is in the normal mode
of operation. Once a REPEAT state processor
executes the "REPEATEND", it will halt its
execution in the WAIT state.

When the first processor executes the
"UNTIL," it interrupts the second processor. If
the result of the conditional expression is TRUE,
the second processor becomes free immediately.
The first processor then follows the UNTIL pointer
to execute the next statement.

If the result is FALSE, the environment
changes stored in the local storage of the second
processor will be copied into the main storage,
and the second processor becomes free. The first
processor then resumes its execution from where
the second processor was interrupted.


III. Architecture


The architecture of a high-level language
computer which can execute the internal language
as described in Section II is shown in Figure 3.
It consists of PES Memory, Main Memory, Partial
Result Storage, a Scanner, an I/O Processor, and a
number of PES Access Processors, Entry Point
Registers, Syntactic and Semantic Processors,
Local Storage, and Arithmetic Processors. The
various kinds of processors are operating
simultaneously in a pipelined manner, and the
organisation is similar to the one proposed by
Haynes [1] except that multiple identical
processors are also used for parallel
computations. Since we are interested only in the
parallel execution aspects of the architecture,
other features which are the same as Haynes [1]
will not be duplicated here.


PES Access Processors


The PES Memory stores the internal
representation of the source programs. During the
translation phase, the PES Access Processor
receives program tokens in the internal form from
the associated Syntactic and Semantic Processor,
assembles and stores them into the PES Memory.
During the execution phase, each PES Access
Processor reads the program from the PES Memory,
separates and delivers the symbols to the
associated Syntactic and Semantic Processor that
it is attached to. A free PES Access Processor
will start executing a (parallel) execution string
by using a non-empty value from one of the Entry
Point Registers as a starting address in the PES
Memory for execution. After that the Entry Point
Register is cleared.

The PES Access Processor can continue reading
from the PES Memory until either its buffers are
full or it has read a semicolon, which indicates
the end of a simple statement in a concurrent
statement.

Parallel Pointers and Jump Pointers are
executed by PES Access Processors. When a PES
Access Processor reads a Parallel Pointer, it puts
the pointer value and its processor identification
into one of the Entry Point Registers and
continues its processing. When a PES Access
Processor reads a Jump Pointer, it simply alters
its program counter and reads the program from the
new location.


Syntactic and Semantic Processors


Each PES Access Processor is attached to a
Syntactic and Semantic Processor. The PES Access
Processor and its associated Syntactic and
Semantic Processor are operating concurrently and
asynchronously. The communication between them is
carried out by the buffers in the PES Access
Processor and a counting semaphore. During
translation, the Syntactic and Semantic Processor
receives program tokens from the Scanner, performs
syntax analysis, translates the program into the
internal language, and delivers the resulting
program to the PES Access Processor. During the
execution phase, the Syntactic and Semantic
Processor executes various types of operators sent
by its PES Access Processor, such as IF, THEN,
ELSE, BEGIN, WHILE, REPEAT, etc. It also sends
commands to its Arithmetic Processor and the I/O
Processor. A Syntactic and Semantic Processor can
also alter the program counter of its PES Access
Processor when it executes a "GOTO", "WHILE", or
"UNTIL." Each Syntactic and Semantic Processor has
its own local memory to temporarily store the
environment changes during try-ahead processing.
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Tha  Syntactic  and  Semantic  Processors  art 
Interconnected  to  aach  othar  ao  that  whan  a 
try-ahaad  path  la  takan  by  a  Syntactic  and 
s auntie  Procaaaor ,  it  can  aaod  lta  procaaaor 
Identification  to  the  procaaaor  which  la  executing 
tha  conditional  axpraaalon •  Altar  tha  conditional 
expression  la  a value tad,  tha  lattar  procaaaor  will 
Interrupt  tha  former  and  taka  tha  appropriate 
actions  aa  deacrlbed  In  Section  II. 


An Arithmetic Processor is connected to each
of the Syntactic and Semantic Processors. When a
Syntactic and Semantic Processor receives an
operand from its PES Access Processor, it saves
the type and value of the operand into its operand
registers. When it receives an arithmetic
operator, it directs its Arithmetic Processor to
perform the operation on the operands stored in
its operand registers. The Arithmetic Processor
will check the types of the operands, and perform
all type conversions if needed. The results of an
arithmetic operation are stored into the operand
registers of the Syntactic and Semantic Processor
which has sent the operator. The scheme used here
does not require any stack for arithmetic
expression execution, and, at any time, no more
than two operands will be in the operand registers
of a Syntactic and Semantic Processor. A stack is
used in the main storage only to allocate space
when a block or procedure is entered.


The Partial Result Storage is used to temporarily
store the partial results obtained during the
execution of an expression. Each location in the
Partial Result Storage has a tag associated with
it to indicate whether it is empty or full. All
tags are cleared initially to indicate "empty".
When a Syntactic and Semantic Processor receives a
partial result operand, i.e., an operand of the
form #i, from its PES Access Processor, it will
check the tag of location i in the Partial Result
Storage. If it indicates "empty", the Syntactic
and Semantic Processor will save the contents of
its operand registers into location i of the
Partial Result Storage and set the tag to indicate
"full". The Syntactic and Semantic Processor then
becomes free. If the tag indicates "full", the
Syntactic and Semantic Processor will reset the
tag to indicate "empty", read the contents of
location i into its operand registers, and use
them as the operand for the next operation.
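
The empty/full tag protocol described above can be illustrated with a minimal Python sketch. The class and method names are hypothetical, and the sketch assumes that each access to a location is atomic, which the hardware would have to guarantee when two processors reach the same #i.

```python
class PartialResultStorage:
    """Tagged cells; every tag starts out as "empty" (False)."""

    def __init__(self, size):
        self.full = [False] * size      # one empty/full tag per location
        self.cells = [None] * size

    def access(self, i, operand_registers):
        """Model one processor meeting a '#i' operand.

        Returns None when the caller deposited its operand registers
        (the processor then becomes free), or the stored contents when
        the caller picks up a partial result left by another processor.
        """
        if not self.full[i]:
            # Tag says "empty": save the registers and mark "full".
            self.cells[i] = operand_registers
            self.full[i] = True
            return None
        # Tag says "full": reset to "empty" and read the contents.
        self.full[i] = False
        contents = self.cells[i]
        self.cells[i] = None
        return contents


prs = PartialResultStorage(4)
first = prs.access(1, ("INTEGER", 7))    # first arrival deposits
second = prs.access(1, ("INTEGER", 5))   # second arrival picks up
```

Whichever processor arrives second continues with the operation, so the order of arrival does not need to be known in advance.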


In this paper we have presented an internal
language for a high-level language computer, in
which arithmetic expressions and concurrent
statements are expressed as parallel executable
strings. Try-ahead operations are performed for
IF statements, WHILE statements, and REPEAT
statements. For an IF statement, both the THEN
path and the ELSE path are tried simultaneously,
while the conditional expression is being
executed. The wrong path is later discarded, and
the right path activated. For WHILE statements
and REPEAT statements, only the repetitive path is
tried ahead, since it is the one more likely to be
correct. The resulting system can increase its
processing speed over other designs through
distributed processing of various tasks by
multiple independent processors, through parallel
execution of arithmetic expressions and concurrent
statements, and through try-ahead processing of
the statements involving conditions.


Appendix 


A Translation Algorithm


The algorithm translates an arithmetic
expression into the internal language form
described in Section 2.1. During the translation
process, two stacks are used: OPR-STK and
OPN-STK, for storing operators and operands,
respectively. Dollar signs ($) are also used
in OPN-STK. LC is the Location Counter, which
contains the address of the location for storing
the next output. Two variables are used:
TEMP-COUNTER is the number of temporary
storage locations used, and PES-BEGIN is the
starting address of the string currently being
generated. An array TEMP-POINTER (TP) is used in
the algorithm. TP(i) stores the address of the
first of the two #i's in the output, so that when
the second #i is generated, a Jump Pointer to the
second #i can be generated at the location
following the first #i.

The algorithm is similar to that of
translating an expression into a reverse Polish
string, except that operands are not written out
immediately and its operator output procedure is
more complicated. A hardware translator can be
easily implemented in the Syntactic and Semantic
Processor [8]. Figure 4 is the flowchart of the
algorithm.
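
For comparison, the conventional reverse Polish translation that the algorithm is said to resemble can be sketched in Python as follows. This is the standard operator-stack method, not the algorithm of this appendix: operands are written out immediately, no $ markers or pointers are generated, and the priority table and function name are assumptions made for illustration.

```python
# Assumed priorities; "(" is lowest so it is never popped by an operator.
PRIORITY = {"(": 0, "+": 1, "-": 1, "*": 2, "/": 2}


def to_reverse_polish(tokens):
    """Translate a list of infix tokens into a reverse Polish list."""
    operators = []   # plays the role of OPR-STK
    output = []
    for tok in tokens:
        if tok == "(":
            operators.append(tok)
        elif tok == ")":
            # Pop operators back to the matching "(".
            while operators and operators[-1] != "(":
                output.append(operators.pop())
            if not operators:
                raise ValueError("unmatched ')'")
            operators.pop()
        elif tok in PRIORITY:
            # Pop operators of equal or higher priority first
            # (">=" gives left-to-right evaluation of equal priorities).
            while operators and PRIORITY[operators[-1]] >= PRIORITY[tok]:
                output.append(operators.pop())
            operators.append(tok)
        else:
            output.append(tok)   # operand: written out immediately
    while operators:
        op = operators.pop()
        if op == "(":
            raise ValueError("unmatched '('")
        output.append(op)
    return output


rpn = to_reverse_polish("A + B * ( C - D )".split())
```

The algorithm below differs in that operands wait on OPN-STK until their operator is popped, which is what allows independent subexpressions to be emitted as separate parallel executable strings.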


Main  Procedure 

1. Clear TP array. Initialize TEMP-COUNTER ← 0;
   PES-BEGIN ← LC.

2. S ← next input symbol.

3. If S is a '(', then push OPR-STK('(') and go
   to 2,
   else if S is a variable, then push
   OPN-STK(S),
   else ERROR.

4. S ← next input symbol.

5. WHILE Priority(OPR-TOP) > Priority(S) DO
   POP-OPR-STK.

6. If S is a ')' and OPR-TOP = '(', then pop OPR-STK
   and go to 4.
   If S is a ')' and OPR-TOP is not '(', ERROR.

7. If S is an arithmetic operator, then push
   OPR-STK(S) and go to 2.

8. If S is 'end-of-expression' and OPR-TOP is not
   a '(', then DONE else ERROR.


Procedure  POP-OPR-STK 

Case 1  The OPR being popped is a unary
        operator:

   Case 1.1  OPN-STK(TOP) is a variable:

      1. FINISH-PREVIOUS-PES.

      2. Pop OPN-STK, and output it.

      3. Output OPR.

      4. Push $(TEMP-COUNTER + 1) onto
         OPN-STK.

   Case 1.2  OPN-STK(TOP) is a $k:

      1. Output OPR.

Case 2  The OPR being popped is a binary
        operator. Depending upon the top
        two elements on OPN-STK, there are
        three cases:

   Case 2.1  Both of the two elements are
             variables:

      1. FINISH-PREVIOUS-PES.

      2. Output the top two elements from
         OPN-STK.

      3. Replace the top two elements on
         OPN-STK by $(TEMP-COUNTER + 1).

      4. Output OPR.

   Case 2.2  One element is a variable, and
             the other is a $i:

      1. Output the variable.

      2. If OPR is non-commutative and
         OPN-STK(TOP) is $i, then output
         OPR', else output OPR.

      3. Replace the top two elements on
         OPN-STK by $i.

   Case 2.3  Both of the two elements are
             $'s. Let OPN-STK(TOP-1) be $k, and
             let OPN-STK(TOP) be $j:

      1. Output #k.

      2. If OPR is non-commutative, then
         output OPR', else output OPR.

      3. Output OPR to the location pointed
         to by TP(k).

      4. Output a Jump Pointer with the
         content of LC to the location
         pointed to by TP(k)+1.

      5. Replace the top two elements on
         OPN-STK by $j.


Procedure FINISH-PREVIOUS-PES

1. TEMP-COUNTER ← TEMP-COUNTER + 1.

2. Output #(TEMP-COUNTER).

3. TP(TEMP-COUNTER) ← LC; increment LC by 2.

4. Output a Parallel Pointer with the content of
   LC to the location addressed by PES-BEGIN.

5. PES-BEGIN ← LC.


Abstract

This paper presents a conceptual
design of the direct-execution Fortran
computer. First, some modifications are
introduced into the language Fortran
to ensure simpler execution and better
performance. Next follows a brief
discussion of the architecture of this
computer and then, in more detail, of
the direct-execution process of some
typical Fortran statements, which may
furnish an outline of the work of this
computer. Finally, some comments are
made on the possible development of
the direct-execution high-level language
computer.


I. Introduction

With the rapid advance of the science and
technology of computers and electronics, the
cost of the hardware becomes cheaper and that
of the software becomes more expensive day by
day. This makes it both possible and necessary
to design the direct-execution high-level
language computer. Since the language Fortran is
the most widely used high-level language, the
research and design of the direct-execution
Fortran computer may not be ill-advised.

This paper gives a conceptual design of the
direct-execution Fortran computer. It uses the
ANSI basic Fortran as the fundamental language,
but in order to ensure simpler execution and
better performance, some modifications are
introduced into this language as follows:

(1) The main program, preceded by the keyword
"Master", should be put at the end. For
interaction between man and machine, where
inputting is carried out token by token, the
main program should be put at the front of
the whole program, but in that case the
j 'inula of each program unit should be
limited to a certain number.

(2) The types of all the variables and arrays
should be declared explicitly, especially
the dummy arguments of the statement
function.

(3) In order to distinguish between the
non-executable and executable statements, a
keyword "Had HO" is placed at the beginning
of each statement function. The dummy
arguments of a statement function are localized
in the program unit of this statement
function.

(4) The EQUIVALENCE statement is deleted.

The architecture of the direct-execution
Fortran computer is described briefly in Section
II of this paper. The direct-execution
procedures of some typical statements of the
language Fortran are discussed in Section III. We
believe that this may furnish an outline of the
work of this direct-execution Fortran computer.
Finally, some comments are made on the possible
development of the direct-execution high-level
language computer in Section IV.

II.  Architecture 

The direct-execution high-level language
computer should execute the program written in
this language directly according to its lexicon,
syntax and semantics without using the
traditional and complicated multilayer software
(such as compilers, assemblers, loaders, etc.).
Thus, its architecture should reflect the
structures of lexicon, control and data of this
high-level language, so that the program written
in this language may be treated more efficiently.

The computer architecture diagram proposed
is shown in Fig. 1. It consists of a Program
Memory PM (to store the user's program), a Data
Memory DM (to store the relevant data) and four
processors (Input/Output Processor I/O P,
Lexical Processor LP, Control Processor CP and
Data Processor DP). Among the processors there
are also the control bus, the address bus, the
data bus and some registers to store information
temporarily. These processors may be
microprocessors or built up with LSI chips. They
may operate in parallel and synchronously with
each other in order to increase the processing
speed.

The user's program may be input into the PM
either all at once, or token by token, executing
and storing simultaneously to allow interaction
between man and machine. After treatment
by I/O P the user's program is input into the
PM in a definite form: namely with a terminal
character at the end of each statement and two
tagging characters, one at the beginning of each
program unit and the other at the end of the
whole program. These tagging characters are
called unit heads and program end characters
respectively; they are in the first position of
the label region and are different from any
ordinary characters used by Fortran. The codes
stored in the PM may be either ASCII or
compressed internal codes.

LP is used for lexical analysis. It includes
the SAM (Scanner Associative Memory), which stores
legal characters, etc. There are two working
modes for LP controlled by CP: scanning and
executing. In the scanning mode, LP checks the
characters sent from PM for a terminal
character, so as to find out the label region
(since the label region is just next to the
terminal character). After LP finds out the
label, the unit head tag and the character "D"
at the first position of the statement, LP is
transferred to the executing mode. In the
executing mode, it checks the legality of
characters sent from PM, spells them into tokens
and sends them to CP and/or DP. If the tokens
are a string of numbers, it sets the ■mm
register to •!«, makes some conversion, puts
the converted codes into the VALUE register
and then sends them out.

CP consists of the CAMU (Unit head Control
Associative Memory), the CAML (Label Control
Associative Memory), the CAMR (Reserved Word
Control Associative Memory), the R Stack (Return
Stack), the DO Stack, the CALL Stack and the
mode register (Mode of DP and LP Register). CP
is the control center of this computer. When the
main program is executed or a subprogram is
called, it sets DP into the operating mode by
means of the mode register. Otherwise it may set
DP into the syntax mode. The working modes of LP
are also set by means of the mode register.

R Stack is used for reserving the return
position. DO Stack is used for reserving the DO
statement information and CALL Stack for
recording relevant information of local
quantities when a call of a subprogram is
executed. When LP outputs a unit head or a
label, CP should fill the entries of CAMU and
CAML respectively for later use in the control
statements concerned.

DP consists of DAM (Data Associative Memory),
some stacks (OP Stack, V Stack, L Stack and P
Stack) and register FDMPT. In the syntax mode,
for non-executable statements (declaration part)
it fills the corresponding entries of DAM for
the variables and arrays but does not allocate
any cells in DM (except for COMMON statements).
For executable statements no treatment is
necessary; it is a matter of CP starting the LP
to continue the scanning. When DP is in
the operating mode, it not only fills the
corresponding entries of DAM, but also allocates
cells in DM for them. Then it calculates the
values of executable statements and assigns
values to them. The register FDMPT points to
the first usable location of the free space in
DM. (After the called subprogram returns,
the space in DM allocated to it should be
released for other uses.) OP Stack stores
operators. V Stack stores the values of operands.
L Stack stores the logical operands.

Besides, DM is the Data Memory; data stored
in it are tagged to indicate the type of data.
Scanner pointer SP is a pointer which points to
the location of the character being treated in
PM. The RESULT register stores the operating
results to control the DO and IF statements.

III. Direct-execution of some statements

Before executing we assume that the user's
program is stored in PM already. The pointer SP
points to the first character of the program in
PM and LP is in the scanning mode.

1.  Treatment  of  unit  head  statements 

When LP scans the unit head tag, it is
changed to the executing mode, spelling the
characters into tokens to be output. CP receives
and analyzes the tokens to determine the type,
class and name of this program unit. Then it
fills these items into the corresponding fields
of global CAMU as shown in Fig. 2, where DAMPT
points to the first location of the local
quantities in DAM, PMPT points to the first
character location of the first executable statement
of this unit in PM, and LPT points to the location
of the first label of this unit in CAML. These
pointers should be filled before the unit is
called.

If this unit is a function subprogram, DP
should be activated to fill its name into the
DAM of this unit as shown in Fig. 3. Then the
location of the first entry in DAM of this unit
should be put into the field DAMPT in CAMU.
Finally, a blank should be left between
two neighboring units in DAM, CAML, etc., to
indicate the end of the units.

For the subprogram with dummy arguments,
after CP recognizes a dummy argument, it
activates DP to fill the entries of this block
successively and put a dummy symbol in the field
Dummy. When it encounters the character ")" it
fills the number of dummy arguments in the field
na.

2. Treatment of declaration statements

When CP encounters the names of variables or
arrays of the non-executable statements, it puts
them into the DSR (Data Search Register) and
then activates DP to find out whether there are
such names in DAM or not. If there are, DP fills
the corresponding fields. If there are not, it
allocates new entries.

The field STRUCTURE indicates whether the
structure of the name is a variable, an array or
a function. The field TYPE indicates whether its
type is a real or an integer; the field COMMON
indicates whether it is allocated in the COMMON
region or not; and the field SIZE indicates the
volume of the array or the number of the dummy
arguments. Besides, the field PT1 points to the
location of the variable (or the location of
the first component of the array) stored in the
DM. In our scheme, for all the quantities of
non-executable statements except those
specified by the COMMON statements, we do not
allocate any cells in DM, that is to say, we do not
fill PT1 until this unit is called by another
program unit. Of course, for these quantities
in the main program cells are allocated. The
pointer PT2 stores the information for
calculating the location of the components of an
array, so it is filled for arrays only.
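
The DAM entry layout and the deferred allocation rule can be sketched in Python as below. The field names follow the text; the dictionary standing in for the associative memory, the `declare` helper, its parameters, and the choice of one DM cell per array component are all assumptions made for illustration. PT2 is left unfilled because its contents are not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DamEntry:
    name: str
    structure: str               # "variable", "array" or "function"
    type: str                    # "real" or "integer"
    common: bool                 # allocated in the COMMON region?
    size: int = 0                # array volume or number of dummy arguments
    pt1: Optional[int] = None    # DM location; None until cells are allocated
    pt2: Optional[int] = None    # array addressing information (arrays only)


def declare(dam, fdmpt, name, structure, type_, common=False, size=0,
            in_main=False):
    """Fill a DAM entry for a declared name and return the new FDMPT.

    DM cells are allocated only for COMMON quantities or for quantities
    of the main program; otherwise PT1 stays unfilled until the unit is
    called.
    """
    entry = dam.get(name)
    if entry is None:
        entry = DamEntry(name, structure, type_, common, size)
        dam[name] = entry        # allocate a new entry
    else:
        entry.structure, entry.type = structure, type_
        entry.common, entry.size = common, size
    if (common or in_main) and entry.pt1 is None:
        entry.pt1 = fdmpt        # first usable location of free space
        fdmpt += size if structure == "array" else 1
    return fdmpt


dam = {}
fdmpt = declare(dam, 100, "X", "variable", "real")             # deferred
fdmpt = declare(dam, fdmpt, "A", "array", "integer",
                common=True, size=10)                          # allocated
```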

3. Treatment of the statement function

The treatment of the statement function is
to fill its name and dummy arguments together
with their types into the DAM but not to
allocate any cells in the DM. When DP encounters
the token "=", the content of SP should be
filled in the field PT3 in the DAM and then CP
sets LP to scan until the terminal symbol of
this statement is encountered.

4. Treatment of the Definitional Label

Before encountering the first executable
statement of the main program, DP is in the
syntax mode, i.e. it only treats the
non-executable statements as discussed above, while
for the executable statements it treats the
definitional label only.

The treatment of the definitional label is
to fill the label in the entry of the CAML of
its unit according to the sequence of its
appearance in the program as shown in Fig. 4,
where PMPT (Program Memory Pointer) points to
the location of the first character of the
statement of this label in PM. DL indicates the
depth of nesting of DO loops and is used for
preventing the program from transferring into an
inner layer of the loops.

For the executable statements of the
subprogram, LP is set in the scanning mode to scan the
label region and the first character of the
statement. Now if the label is encountered, CP
fills one entry of CAML and "0" in its DL field
to indicate that the label is not in any loop
body.

If the first character of the statement is
not the letter "D", then the LP should continue
scanning. If it is, the LP should output
a token. When CP does not encounter the keyword
"DO", it sets LP to scanning mode again; if it
does encounter "DO" the following statement
should be a DO statement such as:

DO L i = m1, m2, m3

Then CP pushes the label of the terminal
statement L into the field TL of the DO stack
(leaving the other fields blank) and pushes the
return location (i.e. the first character of the
loop body) into the R stack as shown in Fig. 5.

When a definitional label is encountered CP
fills an entry of CAML and "0" into its field
DL as well. When the label of the terminal
statement L is encountered, besides filling it
into the CAML, the DO stack and R stack should
be popped, and the values of the DL of all labels
within this DO loop should be increased by "1".

For multi-nested DO loops, say, with three
nested layers, after SP transfers out of all the
DO loops the DL value of the innermost layer is
3, that of the middle layer is 2 and that of the
outermost layer is 1. The treatment of
definitional labels in the main program will be
discussed in the next paragraph.

5. Treatment of DO statements

When DP is in the operating mode to execute
the DO statement "DO L i = m1, m2, m3", if the
terminal statement label L is found in the CAML,
the HOT of L is pushed into the field TL of the
DO stack, m1 is assigned to i, and the locations
of i, m2 and m3 are pushed into the fields CV
(Control variable), FP (Final parameter) and IP
(Incremental parameter) of the DO stack respectively.
The return location is pushed into the R stack
(Return stack) also, as shown in Fig. 5. Labels
which are found in the CAML with addresses both
less than or equal to that of the terminal label
L and greater than or equal to that of the R
stack are within the loop body. Then the values
of the field DL of all the labels within the loop
body should be decreased by "1". The values of
DL of this layer now equal "0", which
indicates that these labels are in the same layer,
so that they may be transferred to. When a nested
DO statement is encountered CP goes through all
the procedure as discussed above. Having executed
the terminal statement of the DO loop, CP
activates DP to calculate i = i + m3 and RESULT = i -
m2. If RESULT ≤ 0, then the top item of the R Stack
should be copied into SP to make the loop
execute again. If RESULT > 0, the loop execution
is completed, and the values of the field DL of all
the labels within the loop body should be
increased by "1", which indicates that these labels
within this loop should not be transferred to. Then
the DO stack and R stack should be popped.

If L is not found in CAML the statements
in the loop body should be executed. When the
definitional labels are encountered, CP
allocates the entries of CAML to them and fills the
fields DL with "0". Having executed the terminal
statement of the DO loop, treat them as
discussed above.
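
The loop control arithmetic of this section can be sketched in Python as follows. The function name and the `body` callback are hypothetical, the stacks and DL bookkeeping are omitted, and the sketch assumes that the loop repeats while RESULT is not positive, with the test made after the terminal statement as described.

```python
def run_do_loop(m1, m2, m3, body):
    """Model 'DO L i = m1, m2, m3': returns the final value of i."""
    i = m1                    # m1 is assigned to the control variable
    while True:
        body(i)               # loop body through the terminal statement
        i = i + m3            # DP calculates i = i + m3
        result = i - m2       # and RESULT = i - m2
        if result > 0:        # loop execution is completed
            break
        # otherwise the return location is copied back into SP
    return i


visited = []
final_i = run_do_loop(1, 7, 3, visited.append)
```

Because the test follows the terminal statement, the body runs at least once even when m1 already exceeds m2.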

6.  Treatment  of  GOTO  and  IF  statements 

If the GOTO L statement is not in any DO
loop, and the label L is found in the
corresponding region of CAML, CP checks the value
of the field DL; if it is "0", the program may
transfer to the label L; otherwise, an error
has occurred. If the label L is not found in
the corresponding region of CAML, CP sets LP
and DP into scanning and syntax modes
respectively, scanning the program to find the L. The
treatment is similar to paragraph 4.

When the L is found in the program, in
order to prevent transfer into the inner layer
of a DO loop from the outer layer, the L should not
be transferred to immediately (although the value
of its field DL is "0" at that time). The
location of L in CAML should be stored in the
temporary register TR. LP scans the program
continuously until it returns to the same layer
as this GOTO L statement, i.e. the DO loop
layers whose fields CV, FP and IP in the DO
stack are blank should be scanned out and the
values of field DL should be increased. Then
the values of the fields PMPT and DL in CAML
should be found out by means of the content in
TR. If the value of the field DL is "0" then
the program transfers to L; otherwise an error
has occurred.

For the GOTO L statement lying in a certain
nested DO loop layer, it is necessary to find
out L within the current nested layer of CAML.
(If it is not found in CAML, LP should be set
in the scanning mode to scan the program
to find the L of this DO loop layer in the
program as discussed above.) If L is found and
its DL value is "0", the program should be
transferred to L. If L is not found, the values
of the field DL of all the labels within this
loop layer should be increased by "1". CP pops
the top of the DO stack and R stack, and goes on to
find the label in the outer layer (in the CAML
or in the program). The above process is
repeated again and again until the L is found
and the program is transferred to it.

The execution of the IF statement IF (e) L1,
L2, L3 is similar to the GOTO statement. When
CP recognizes the keyword "IF" it activates
DP to calculate the expression and puts its
logical result (less than, equal to or greater
than zero) into the RESULT register. Then
according to this result CP puts the value of
the corresponding PMPT of L1, L2 or L3 into
SP to perform the transfer.
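
The three-way selection can be sketched in Python as below, following the usual Fortran arithmetic IF convention (negative, zero, positive select L1, L2, L3 in that order). The function and parameter names are hypothetical, and the DL legality check shared with GOTO is omitted.

```python
def arithmetic_if(e, pmpt_l1, pmpt_l2, pmpt_l3):
    """Return the PMPT value that CP would place into SP."""
    if e < 0:
        return pmpt_l1    # expression less than zero
    if e == 0:
        return pmpt_l2    # expression equal to zero
    return pmpt_l3        # expression greater than zero


new_sp = arithmetic_if(-2.5, 120, 240, 360)
```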

7. Treatment of the call of statement function

In the case of calling a function subprogram
or a statement function, it is necessary to find
the name of the function in the region of the
current operating program unit of the DAM. If
this name is found it is a statement function;
otherwise, it may be a function subprogram. For
a function subprogram, its name should be found
in the CAMU.

CP copies the values in the fields PMPT,
DAMPT and LPT of the CAMU into the fields of the
temporary register. CP allocates a cell in DM
for the function name to store values of the
function. Then the CP recognizes the actual
arguments; say, there are three arguments: A
(variable), 3 (constant) and C + D (expression).
CP activates DP to find out (by DAM) the
location in DM allocated for A. The locations of A
and those of temporary cells allocated for
constant 3 and the result of expression (C + D)
together with the location of the function name
should be copied into the fields PT1 of dummy
arguments and the function name of the called
subprogram in DAM respectively. Then DP pushes
the value of the field PT1 of the function name
into P stack, as shown in Fig. 6.

In our scheme we use call by name. Certainly
during the process of substitution some syntax
checking (such as whether the numbers of the
actual and dummy arguments are equal, whether
the types of both arguments are the same, etc.)
should be made. When the character ")" has been
treated, the return location in PM (the value
of SP) should be pushed into R stack, and the values
of DAMPT and LPT of TR and that of FDMPT pushed
into the CALL stack as shown in Fig. 6. The value
of PMPT of CAMU in the temporary register should
be put into the pointer SP to perform the
transfer. In executing the executable statements of
the function subprogram, DP should allocate
cells in DM for local variables which have not
been allocated yet, and should modify FDMPT also.
Once the RETURN statement is encountered CP
puts the value of R stack into SP, pops the CALL
stack and R stack and clears all the fields PT1
of the DAM of this subprogram. The calculated
result is now automatically available in the
cell of the function name.

The call of a statement function is very
similar to that of the function subprogram, but
the value of PT3 of the statement function in
the DAM should be put into SP instead of PMPT
in CAMU of the function subprogram; at the same
time, it is not necessary to alter the CALL
Stack.

Since the call of a subroutine statement
is preceded by the keyword "CALL", it is
easier to recognize. The treatment of the call
of a subroutine is rather similar to that of a
function subprogram.


IV.  Conclusion 


The language Fortran has been in use for
many years in scientific computation and is
familiar to most computer users. Since, however,
quite a lot of trouble is involved in the use
of the language Fortran for direct execution,
we have to modify it properly in designing the
high-level language computer.

Today, the computer hardware and the computer
software hold a relation of mutual impetus,
mutual penetration and mutual constraint. The
development of the computer languages and
programming has greatly affected computer
architecture, as is shown in the improvement from
classical computer architecture to high-level
language computer architecture. On the other
hand, the development of computer architecture
also leads to a development of languages, such
as the HIM language proposed by Prof. Yaohan
Chu.

To sum up, the development of the high-level
language computer should lead to a close
merging of the programming language and the
computer architecture; that is, the language
and the computer architecture ought to execute
the program effectively and the programming
also ought to satisfy the requirements of the
language and architecture, so as to improve the
reliability and practicality as well as the
cost-efficiency of the whole system. So in the
long run, it is necessary to reconsider and
redesign new language from the point of view
of programming and computer architecture.

Indeed the conception of structured programming
and structured language has appeared already,
but the languages evolved are not solely dedicated
to the high-level computer.

Since the said HIM language has not won
wide acceptance yet, we think it necessary
to design some new computer architecture for
the current languages such as Fortran, Cobol,
etc. This is our motivation in writing this
paper.
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Abstract 

This paper reports a JOVIAL direct-execution
machine which accepts a subset of
the JOVIAL J73 language. It describes the
J73 subset. It also describes the organization
of the JOVIAL direct-execution computer which
reflects the language constructs of the J73.
There are 3 processors, 3 associative memories,
a program memory, a data memory and 10 interfacing
registers. The memory/register/stack
structures and direct-execution algorithms
of the processors are described.


1.  Direct  Execution  Computer 

Direct-execution refers to the operating
mode of a high-level architecture. This operating
mode directly accepts and executes a high-level
language program without the need of multiple
layers of conventional software. As a result,
there is no compiler, no assembler, and no
linkage editor. The high-level programming
language is the machine language that the bare
hardware recognizes. A direct execution computer
is capable of operating in the direct-execution
mode.

The  direct-execution  computer  (oj  Is 
structured with a direct execution cycle; this
is shown in Fig. 1. A high-order language
program is stored in the program memory. The
lexical processor fetches the next token from
the program memory and delivers the token to
the language processor; the language processor
executes the token accordingly. This cycle
continues until the program ends.


The direct-execution computer [1,7,8] is
organized to reflect the constructs of a high-level
programming language. The organization
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is  shown  in  Fig.  2,  where  there  are:  a  program 
memory PM, a data memory DM, three associative
memories (SAM, CAM, and DAM), and three processors
(lexical processor LP, data processor DP, and
control processor CP). The program memory stores
the source program. The data memory stores the
data values. The associative memories store
descriptors which represent the data and control
information in the source program. After
initialization, the control processor fetches the next
token from the lexical processor which has access
to the program memory. It then either executes
the token or activates the data processor to
execute it. This process of direct-execution
token-by-token continues until the source program
reaches the end.
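
The token-by-token cycle just described can be illustrated with a minimal
Python sketch. It is not the machine's design: the regular expression, the
reserved-word set, and the function names are assumptions made for the
illustration, and the "execution" is reduced to recording which processor
would handle each token.

```python
import re

# Hypothetical reserved-word set; chosen only for this illustration.
CONTROL_WORDS = {"START", "PROGRAM", "TERM", "BEGIN", "END",
                 "IF", "ELSE", "FOR", "BY", "WHILE"}


def lexical_processor(program_memory):
    """LP: deliver the next token of the source text on each request."""
    pattern = r"[A-Za-z][A-Za-z0-9]*|\d+|\S"
    for match in re.finditer(pattern, program_memory):
        yield match.group()


def direct_execute(program_memory):
    """CP: fetch tokens until the program ends.

    Control words are kept by the CP; every other token is handed to the DP.
    """
    trace = []
    for token in lexical_processor(program_memory):
        handler = "CP" if token in CONTROL_WORDS else "DP"
        trace.append((token, handler))
    return trace


if __name__ == "__main__":
    for token, handler in direct_execute("START PROGRAM T; BEGIN X = 1; END TERM"):
        print(handler, token)
```

The source text is never translated into another form; the loop simply
consumes it in program order, which is the point of the direct-execution
mode.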

This paper describes a JOVIAL direct-execution
computer, which makes use of the above-mentioned
direct-execution organization.

2.  A  JOVIAL  Machine 

The JOVIAL computer in this paper is designed
for a subset of the revised MIL-STD-1589A (USAF)
definition of the upgraded J73 JOVIAL programming
language dated March 15, 1979 [11].

2.1  A  J73  Subset 

The J73 is a dialect and an outgrowth of the
ALGOL 60 programming language [10]. As a result,
the J73 retains a great deal of the ALGOL 60
language. It is a complex compiler-oriented
language. A subset of J73 is chosen. There are
46 syntactical statements. The syntactical
constructs are outlined below.

(a)  Program  Structure 

The subset allows the complete program to
have a main-program module and zero or more
procedure modules. The main program module must
be the first module. Its construct is shown below.

START PROGRAM <name>; <program body> TERM

The construct of the program body is the same as
the procedure body except the former permits
directives.


(b) Declarations

There are four types of declarations: item,
table, external, and define. The first two declare
the data elements, while the third declares a
procedure module. The last is a macro for text
substitution. The declarations may be enclosed
by a pair of 'BEGIN' and 'END' to become a block
declaration.

(c) Procedures and Functions

There is both procedure declaration and
procedure definition. The procedure declaration
is for use in the external declaration. When a
procedure definition is enclosed by a pair of
reserved words 'START' and 'TERM', it becomes a
procedure module. It permits formal parameters.
There is one function 'FLOAT (<number>)' which
converts an integer into a floating number.

(d)  Statements 

A statement can be simple or compound. There
are four types of simple statements: assignment,
loop (or FOR-statement), IF, and procedure-call.
Statements may be enclosed by a pair of 'BEGIN'
and 'END' to become a compound statement.

(e)  Formulas 

There are three types: integer, floating,
and boolean. An integer formula represents an
integer, while a floating formula represents a
floating-point number. There are four operators
('+', '-', '*', and '/') for both integer and
floating-point operations. A boolean formula
represents a value of true or false. There are
six relational operators.

(f) Data References

There are two types of data references:
variables and function-calls. A variable can be
an item or a table. As mentioned, there is only
one intrinsic function.

(g) Lexical Elements

There are 56 characters which are grouped
into 26 letters, 10 digits, and 20 marks. There
are 4 basic lexical elements: token, comment,
define, and trace. A token can be a name, a
number, a floating-literal, or an operator. There
are 34 operators which include 15 reserved words.
A comment is a string of characters enclosed by
a pair of quotation marks. A define has as its
body also a string of characters enclosed by a
pair of quotation marks. Only one directive is
permitted; this directive has as its body a series
of names separated by commas.

2.2  A  Sample  Program 

A J73 sample program is shown in Fig. 3. The
line numbers are not a part of the program; they
are used for references. The numbers are the same
for the two parts of the program; they are
distinguished by being referred to as upper lines
and lower lines.

The upper lines 00000 to 02000 indicate the
start and the termination of the complete program.
The complete program consists of a main program
module and a procedure module. The main program
module consists of program name TSIJ0V (upper line
00100) and program body (lines 00200 to 01900). The
program body has an external declaration (upper
line 00300) of PROC TRIG which includes a block
declaration of items ANG, SANG, and CANG (upper
lines 00400 to 00800), three declarations of tables
DEG, SSIN, and CCOS (upper lines 00900 to 01100),
a declaration of item II (upper line 01200), a
directive of TRACE (upper line 01300), and a FOR
statement (upper lines 01400 to 01900). In this
FOR statement, there is a call of procedure TRIG
(line 01700).

The procedure module begins and terminates
at the lower lines 00000 and 02200, respectively.
It has a procedure heading (lower line 00100) and
a procedure body (lower lines 00200 to 02100). The
procedure body has a define declaration (lower
line 00200), declarations of six items (lower lines
00300 to 00800), a comment (lower lines 00900 to
01000), three assignment statements (lower lines
01300 to 01500), and a FOR statement. The controlled
statement of the FOR statement is a compound
statement which consists of an assignment statement and
an IF statement.

2.3 Computer Organisation

The organisation of the JOVIAL direct-execution
computer [13] is developed from the direct-execution
computer organization in Fig. 2. It is shown in
the diagram in Fig. 4 where there are the following
computer elements:

(a) 3 processors: LP, CP, and DP,

(b) 3 associative memories: SAM, CAM, and DAM,

(c) 2 random access memories: PM and DM,

(d) 2 tables in ROM,

(e) 10 interfacing registers, and

(f) main bus.

The memory/register/stack structures of the
3 processors together with the interfacing registers
are shown in Fig. 5. The functions of these 10
interfacing registers are described below.

(a) Register SPTR points to a character in Program
Memory. It is of special importance when
marking the location and other unique pointers
of control statements and procedure modules,
the bodies of define declarations, and the
return position from procedure and define calls.
Except for the very first call for a token,
SPTR is set at the first character of the next
token to be formed when a token is requested.
After the token has been formed by the Lexical
Processor, SPTR is advanced, if necessary, to
point to the first character of the next token.
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(b) Register TOKEN holds the last token formed by
the LP. This register is referenced by nearly
every sequence, since the tokens define the
program.

(c) Register TYPE stores the type of a name. A
name may be a reserved word ('R'), pseudo-function
call ('FLOAT'), trace-directive name
('DIR') or identifier ('N').

(d) Register BLOCK stores the top entry of the
Control Processor's BSTACK. It identifies
the module which the program is currently
executing so scope checks can be made on
declared names.

(e) Register BACK-PTR saves the position of the
first character of the current token.

(f) Register D-LEV stores the top entry of the
Lexical Processor's RETURN stack. (An empty
stack gives D-LEV a value of zero.) This
value identifies a specific define call or
that there is no active define call (it can
be considered a 'define-activation-level').
The register's purpose is to protect the CP
from creating CAM entries for control
statements whose Program Memory Pointers
are in different define bodies.

(g) Register DEF-DCL is a flag which identifies
whether a define declaration is permitted or
not. Define declarations follow all the rules
associated with other declarations.

(h) Register DEF-CALL is a flag which identifies
whether or not a define call is permitted.
A define call is not allowed when the next
token expected is a module name or declaration
name.

(i) Register PROC-NAME saves the name of a procedure
when the procedure is called. The
register is used to match a procedure module
name on the first call of the procedure, and
as a switch to determine if the procedure
heading and declaration list must be processed
(first time) or skipped over (second time).

(j) Register RESULT holds the information about the
value, type and structure of formulas and
variables. It is used by the Data Processor
to calculate and pass values to the Control
Processor.
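
As a rough illustration of the register set listed above, the following
Python sketch models the ten interfacing registers as one record and shows
how the code stored in register TYPE might be chosen. The field types, the
default values, and the reserved-word set are assumptions; only the register
names and the four type codes come from the text.

```python
from dataclasses import dataclass, fields

# Hypothetical reserved-word set, used only by this illustration.
RESERVED_WORDS = {"START", "PROGRAM", "TERM", "BEGIN", "END", "IF", "ELSE", "FOR"}


@dataclass
class InterfaceRegisters:
    sptr: int = 0            # SPTR: position of a character in Program Memory
    token: str = ""          # TOKEN: last token formed by the LP
    name_type: str = ""      # TYPE: 'R', 'FLOAT', 'DIR' or 'N'
    block: int = 0           # BLOCK: top entry of the CP's BSTACK
    back_ptr: int = 0        # BACK-PTR: first character of the current token
    d_lev: int = 0           # D-LEV: zero when no define call is active
    def_dcl: bool = False    # DEF-DCL: define declaration permitted?
    def_call: bool = False   # DEF-CALL: define call permitted?
    proc_name: str = ""      # PROC-NAME: name of the procedure being called
    result: object = None    # RESULT: value/type/structure from the DP


def register_names():
    """Return the names of all modelled registers."""
    return [f.name for f in fields(InterfaceRegisters)]


def classify_name(name):
    """Choose the code that register TYPE would hold for a name."""
    if name == "FLOAT":
        return "FLOAT"       # the single pseudo-function
    if name == "TRACE":
        return "DIR"         # the single trace directive
    if name in RESERVED_WORDS:
        return "R"
    return "N"               # any other name is an identifier
```

A freshly constructed `InterfaceRegisters()` corresponds to the state in
which every register has been cleared.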

3.  Control  Processor 

The control processor CP directly executes
control constructs such as conditional branch,
procedure call, nesting, and looping of the J73
subset. It also creates and stores the control
descriptors in the control associative memory
CAM. These control descriptors can expedite the
repeated execution of statements in a program
loop without the need for repeated syntactical
processing.


A J73 program specifies a sequence of data
operations. The sequencing is specified by control
statements. The control processor recognises the
control reserved words and then manipulates the
pointer in register SPTR (which points to the next
character in execution of the source program) of the
LP processor to carry out the sequencing.

The structure of the CP consists of one
associative memory and 5 stacks as shown in Fig. 5.
The functions of these memory and stacks are
described below.

(a) Control Associative Memory CAM is to speed up
statement execution by saving critical information
about control statements and procedure
modules. There are three types of CAM entries:
if-statement, loop-statement, and procedure-module.
The type of entry is stored in the
Type field. Information for control statements
consists of fields for the location of the
statement (for identification), else-part
pointer for if-statements, increment formula
pointer for loop-statements, and an exit pointer
to point to the token following the statement.
Procedure CAM entries store the name, location,
formal-parameter-list pointer, and body pointer
of procedure-modules.

The CAM entries for some control statements
composed of define-calls cannot be made.
Unless the Program Memory Pointers of a control
statement's CAM entry are all on the same
'define-activation level', the proper stack
management steps of define-calls and returns
may not be followed when SPTR is jumped. When
this type of statement is encountered, it will
always be treated as a 'first-time', so SPTR
is adjusted by repeated calls of sequence
NEXT-TOKEN.

(b) Stack BSTACK saves the body pointers of
program-body and active procedure modules. Each entry
uniquely identifies a module. The top entry
of BSTACK is stored in register BLOCK to
identify the currently executing module.
In addition, the positions of 'BEGIN's before
the first simple-statement of a module body
are pushed onto (and then popped off) the stack.
This is necessary because both compound-declarations
and compound-statements are
delimited by 'BEGIN' and 'END'.

(c) Stack RSTACK saves the return position of
procedure calls. The token position following
an executed procedure-call-statement is pushed
onto the stack. It is restored into register
SPTR after the procedure-body is executed.

(d) Stack CTR-STACK's top entry serves as a counter
of the 'BEGIN's pushed onto BSTACK and the
parameters in a parameter list.
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(e) Stack SPTR-STACK has two fields: LOCN holds the
location of an active control statement, and
DLEV holds the define-activation-level of the
location pointer. Before a control statement's
PM pointer field is assigned a value, the
current activation-level must equal that on
the SPTR-STACK. If they are ever different the
statement's CAM entry cannot be kept - the
location field must be erased. Loop-statements
must have their increment and exit pointers on
the same level. If they are not, the statement
is considered illegal.
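
The role of the CAM entries and of the activation-level check can be
sketched in Python as follows. This is a simplified model under stated
assumptions: the associative memory is represented by a dictionary keyed on
the statement location, and the class, method, and field names are
hypothetical. Only the kinds of fields (location, else pointer, increment
pointer, exit pointer) and the rule that levels must agree are taken from
the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlEntry:
    kind: str                       # "if" or "loop"
    location: int                   # PM position of the statement (search key)
    else_ptr: Optional[int] = None  # used by if-statements
    incr_ptr: Optional[int] = None  # used by loop-statements
    exit_ptr: Optional[int] = None  # token following the statement


class ControlCAM:
    """Associative lookup by statement location, modelled with a dict."""

    def __init__(self):
        self.entries = {}

    def lookup(self, location):
        """Return the saved entry, or None if this is a 'first time'."""
        return self.entries.get(location)

    def record(self, entry, current_level, stacked_level):
        """Keep the entry only if the define-activation-levels agree."""
        if current_level != stacked_level:
            self.entries.pop(entry.location, None)  # erase the location
            return False
        self.entries[entry.location] = entry
        return True
```

On a later execution of the same statement, a successful `lookup` supplies
the saved pointers directly, so the statement does not have to be scanned
again.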

The control processor CP directs the control
flow. It activates both the data processor
DP and the lexical processor LP. It processes
the following control constructs:

(a) Program structure

(b) Procedure definition

(c) External declaration

(d) Statement

(e) IF statement

(f) FOR statement

(g) Proc-call statement

(h) Declarations

In the following processor design, sequence
NEXT-TOKEN, which fetches the next token from
the source program, is executed by processor
LP as will be described later.

3.1  Program  Structure 

The program structure consists of those control
constructs which form a complete program. These
are shown below.

1. <complete-program> ::= <main-program-module>
                          [<procedure-module>...]

2. <main-program-module> ::= START
                             PROGRAM
                             <name> ;
                             <program-body>
                             TERM

3. <program-body> ::= BEGIN <decl-list>
                      [<directive>...]
                      <statement>...
                      END

4. <procedure-module> ::= START
                          <procedure-definition>
                          TERM

The above syntax calls for the following hardware
sequences.

01 COMPLETE-PROGRAM
02 INIT
02 MAIN-PROGRAM-MODULE
03 PROGRAM-BODY
02 CAM-CHECK /*check the variables in CAM*/
02 NEXT-TOKEN /*processed by LP*/

Most of the names of these sequences reflect
the terminals or non-terminals of the control
syntax. The level numbers indicate the hierarchical
relationship of these sequences. These sequences
are briefly explained below.

(a) Sequence COMPLETE-PROGRAM. This sequence
reflects the syntax that the complete program
has one main-program module followed by 0 or
more procedure modules.

(b) Sequence INIT. This sequence sets the registers
to zero and empties the stacks.

(c) Sequence MAIN-PROGRAM-MODULE. This sequence
is identified by three reserved words and a
semicolon as follows.

START PROGRAM...;... TERM

This sequence identifies the main program
module. It calls sequences NAME and
PROGRAM-BODY.

(d) Sequence PROGRAM-BODY. The program body is a
series of 0 or more declarations followed by
one or more statements, enclosed by a BEGIN/END
pair. The presence or absence of a
declaration has to be determined by the first
token of the declaration.

(e) Sequence CAM-CHECK. This sequence searches the
CAM for an entry whose name field is the same
as the contents of register TOKEN. If it is
not found, it returns; otherwise, it is an
error.

(f) PROC-MODULE

A procedure module is identified by two
reserved words as follows.

START...TERM

However, there is no need for sequence
PROC-MODULE since each will be searched and called
as a result of a procedure call.

3.2 Procedure Definition

The procedure definition specifies a procedure
structure. The syntax is shown below.

5. <procedure-definition> ::= <procedure-heading> ;
                              <procedure-body> ;

6. <procedure-heading> ::= PROC <name>
                           [<formal-parameter-list>]

7. <procedure-body> ::= BEGIN <decl-list>
                        [<statement>...]
                        END

The above syntax calls for the following hardware
sequences.

03 PROC-DEF
04 PROC
04 DEF-HEADING
04 PROC-BODY
05 PARA-CHECKS
05 PARA-POP
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These sequences are explained below.

(a) Sequence PROC-DEF. This sequence calls sequence
PROC-HEADING and then calls sequence PROC-BODY.

(b) Sequence PROC-DEF-HEADING. Sequence
PROC-DEF-HEADING checks the syntax of the
procedure-heading and sets the parameter pointer and body
pointer of the procedure's CAM entry.

(c) Sequence PROC-BODY. A procedure body is similar
to a program body, except for two special
considerations: the formal parameters must be
declared, and the declaration list is skipped
over after the first call of the procedure.

(d) Sequence PARA-CHECKS. This sequence checks
whether the types and structure of actual
and formal parameters agree.

(e) Sequence PARA-POP. This sequence calls
DP-sequence PARA-RETURN to pop each of the
procedure's parameters off the parameter
stacks in the DP and to return the result
of output parameters.

3.3 External Declaration

The external declaration declares an external
procedure. The syntax is shown below.

8. <external-declaration> ::= REF <procedure-heading> ;
                              [<declaration>]

The above syntax calls for the following hardware
sequences:

01 EXTERNAL-DECL
02 SCAN-DECL
02 PROC-HEADING
03 PROC
03 FP-LIST-CHECK

These  sequences  are  explained  below. 

(a) Sequence EXTERNAL-DECL. The external declaration
is recognised from the reserved word
'REF' which is then followed by a call of
sequence PROC-DECL.

(b) Sequence SCAN-DECL. This sequence skips the
declarations of the external procedure's
parameters.

(c) Sequence PROC-HEADING. This sequence identifies
procedure names and their parameter lists.

(d) Sequence PROC. This sequence fetches the proc
name and then searches for it in the CAM. If
it is not found, it creates a CAM entry for the
procedure and inserts the name in the entry.

(e) Sequence FP-LIST. This sequence counts and
checks the syntax of a formal parameter list.


3.4 Statement

A statement can be a simple statement or a
compound statement. There are 4 types of simple
statements: if, for, proc-call, and assignment.
The first three statements are executed by the CP.
The assignment statement is executed by the DP.
Two additional statements, define and comment,
are handled by the LP. The syntax is shown below.

9. <statement> ::= <simple-statement>
                 | <compound-statement>

10. <simple-statement> ::= <assignment-statement>
                         | <loop-statement>
                         | <if-statement>
                         | <procedure-call-statement>

11. <compound-statement> ::= BEGIN <statement>... END

The above syntax calls for the following hardware
sequences:

01  STMT 

02  COMPOUND-STMT 
02  SIMPLE-STMT 

These sequences are explained below.

(a) Statement. Sequence STMT calls sequence
SIMPLE-STMT or sequence COMPOUND-STMT by the
absence or presence of 'BEGIN' respectively.

(b) Compound Statement. Sequence COMPOUND-STMT
calls sequence STMT one or more times.

(c) Simple Statement. Out of the four types of
simple statements the IF and LOOP statements
can be positively identified by 'IF' and 'FOR',
respectively. On the other hand, proc-call
and assignments begin with a name. However,
the proc-call statement begins with a procedure
name which must have been declared and should
be found in the CAM. If it is not found, the
name is assumed to be the data name of an
assignment statement.
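
The identification rule in (a) to (c) amounts to a dispatch on the first
token of a statement. A minimal Python sketch follows; the function name and
the representation of the CAM's procedure names as a set are assumptions
made for the illustration.

```python
def classify_statement(first_token, cam_procedure_names):
    """Decide which sequence handles a statement from its first token.

    cam_procedure_names stands in for the procedure names currently
    held in the CAM.
    """
    if first_token == "BEGIN":
        return "compound"
    if first_token == "IF":
        return "if"
    if first_token == "FOR":
        return "loop"
    if first_token in cam_procedure_names:
        return "proc-call"
    # A name that is not a known procedure is taken as a data name.
    return "assignment"


if __name__ == "__main__":
    procedures = {"TRIG"}
    for token in ("BEGIN", "IF", "FOR", "TRIG", "ANG"):
        print(token, "->", classify_statement(token, procedures))
```

The order of the tests matters only for the last two cases: a name is
looked up among the procedures first, and the assignment interpretation is
the fallback.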

3.5 Loop Statement

The loop statement in Machine A is the
FOR-statement. It has a control variable, an initial
value, and an incremental value. There is
additionally a while-clause which sets the
condition to terminate the looping. The syntax
is shown below.

12. <loop-statement> ::= FOR <variable> : <integer-formula>
                         BY <integer-formula>
                         WHILE <boolean-formula> ;
                         <statement>
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The above syntax calls for the following hardware
sequences.

01 LOOP-STMT
02 SCAN-STMT
02 FIRST-TIME-LOOP
02 REPEAT-WHILE

The loop statement faces 3 considerations:
(a) looping, (b) nesting of loop statements, and
(c) first-time problem.

(a) Looping. The looping requires computation of
a new value of the control variable and evaluation
of the boolean formula. If the evaluated result
is true, the loop body is executed, and if the
loop body is executed for the first time, the
EXIT-PTR is inserted in the CAM entry. If the
evaluated value is false, the loop's statement
is scanned and the EXIT-PTR marked, or execution
is directly jumped to EXIT-PTR.

(b) Nesting of Loop Statements. The nesting of loop
statements (and if statements) is handled by
pushing its CAM entry LOCN fields onto stack
SPTR-STACK at the beginning of the sequence
and by popping it off at the end.

(c) First-Time Problem. If no CAM entry exists for
this statement, one must be created. The
location and increment formula position are
stored in addition to the EXIT-PTR.

3.6 If Statement

The if statement causes conditional branching.
The syntax is shown below.

13. <if-statement> ::= IF <boolean-formula> ;
                       <statement>
                       [ELSE <statement>]

The if statement faces 4 considerations:
(a) branching, (b) nesting of if statements,
(c) first-time problem, and (d) optional
else-clause. These considerations are discussed below.

(a) Branching. The branching requires evaluation of
the boolean formula. If the evaluated result is
true, the then-clause is executed and the
execution continues at the EXIT-PTR. If it is
false, the execution continues at the location
indicated by the ELSE-PTR if it exists, and
otherwise at the EXIT-PTR.

(b) Nesting of If Statements. The nesting of if
statements is handled by pushing the LOCN fields
of their CAM entries onto stack SPTR-STACK at
the beginning of the sequence and by popping it
off at the end.

(c) First-time or Second-time. During the first
time, if the boolean formula is true, the
then-clause is executed but the else-clause is
scanned by sequence SCAN-STMT. If the boolean
formula is false, the then-clause is scanned by
sequence SCAN-STMT, but the else-clause is
executed. During successive times, no
scanning is needed since all pointers have
been established.

(d) Optional Else-clause. The else flag is
available in the CAM entry to indicate
whether there is an else-clause. If there
is, the else-flag is set and the ELSE-PTR is
inserted.

3.7 Procedure Call Statement

The procedure-call statement invokes the
execution of a procedure definition. It should
be noted that the procedure definition may occur
before or after a procedure-call statement. If
it is before, the location of the procedure
definition can be found from the CAM. If it is
after, the program execution has to be suspended
and the source program is scanned until the
procedure definition is found. The syntax of the
procedure call statement is shown below.

14. <procedure-call-statement> ::= <name>
                                   [<actual-parameter-list>] ;

The above syntax calls for the following sequences:

01 PROC-CALL-STMT
02 SCAN-UNTIL-PROC
02 PROC-DEFINITION

The procedure-call statement faces 5
considerations: (a) existence of parameters,
(b) nesting of proc calls and returns, (c) first-time
problem, (d) ahead or behind a proc definition,
and (e) call-by-value or by-reference.
These considerations are discussed below.

(a) Parameters. The parameters may or may not
exist. They can be input or output parameters.
Their presence is determined by the parameter
count field of the procedure's CAM entry.
The DP is then activated to execute sequence
ACTUAL-PARA-LIST.

(b) Nesting of Proc Calls and Returns. When a
proc-call statement is encountered, the return
address of the calling procedure is pushed
down onto RSTACK. When the execution of a
procedure reaches the end, the return address
is obtained from the top entry of RSTACK and
the entry is then popped off.

(c) First-time Problem. If the PROC-NAME register
is not empty, the procedure is called for the
first time. During the first time, program
execution is now changed into program
scanning until the procedure definition is
found. This identification is achieved by
comparing each procedure name encountered
during scanning with that in the NAME field
of the top entry of RSTACK. The scanning is
done by sequence SCAN-UNTIL-PROC.


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Fig. 1 Program Storage and Direct Execution of a High-level
Language Program. (a) Program storage (labels: Program Memory, Entry)


(d) Second-time Problem. The declaration list
of the procedure body is not processed on
succeeding calls.

(e) Call-by-value or by-reference. The parameter
passing in the J73 is as follows.

(1) Formal-input parameter: it must be an
item. It is bound by value.

(2) Formal-output parameter: if it is an item,
it is bound by value-result. If it is a
table, it is bound by reference.

(3) Actual-input parameter: it can be an
integer or a floating formula.

(4) Actual-output parameter: it must be a
variable.

The evaluation and passing of parameters are
handled by the DP.
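
The three binding rules in (e) can be compared side by side in a short
Python sketch. It is an illustration under assumptions, not the DP's
algorithm: the caller's variables are held in a dictionary `env`, tables are
Python lists, the procedure body is a function that works on a dictionary of
formal parameters, and all names are hypothetical.

```python
def call_with_binding(body, env, in_items, out_items, out_tables):
    """Bind parameters, run the body, then copy results back.

    in_items:   formal name -> already evaluated value   (by value)
    out_items:  formal name -> caller's item name         (by value-result)
    out_tables: formal name -> caller's table name        (by reference)
    """
    local = {}
    for formal, value in in_items.items():
        local[formal] = value            # by value: only the value is passed
    for formal, actual in out_items.items():
        local[formal] = env[actual]      # value-result: copy in
    for formal, actual in out_tables.items():
        local[formal] = env[actual]      # reference: share the same table
    body(local)
    for formal, actual in out_items.items():
        env[actual] = local[formal]      # value-result: copy out on return


if __name__ == "__main__":
    def body(p):
        p["R"] = p["X"] * 2              # set an output item
        p["TAB"][0] = 7                  # update a table element in place
        p["X"] = 99                      # change to an input stays local

    env = {"A": 3, "S": 0, "T": [0, 0]}
    call_with_binding(body, env, {"X": env["A"]}, {"R": "S"}, {"TAB": "T"})
    print(env)
```

The difference between the two output cases is when the caller sees the
change: a table is modified as the body runs, while an output item is
updated only when the procedure returns.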

3.8  Declarations 

The declaration statements consist of:

15. <decl-list> ::= (<declaration>
                   | <define-declaration>
                   | BEGIN <decl-list> END)...

16. <declaration> ::= <item-declaration>
                    | <table-declaration>
                    | <external-declaration>

The above syntax calls for the following
hardware sequences.

01 DECL-LIST
02 DECL

(a) Sequence DECL-LIST processes the declarations
of a program-body or procedure-body. Names
may not be declared twice in the same module,
nor duplicate a procedure-name. A define-call
is not permitted when the name of a
declaration is the next token expected.
'BEGIN' reserved words are stacked because
they may signal either a compound declaration
or compound-statement. After all the declarations
have been processed SPTR is adjusted,
if necessary, to point to the token which
begins the first directive or statement.

(b) Sequence DECL calls either ITEM-DECL,
TABLE-DECL or EXTERNAL-DECL to process a declaration.

4.  Data  Processor 

A J73 program specifies data elements in data declarations and type declarations. It also specifies data operations by assignment statements; for example, the operations can be arithmetic or logical. When the control processor identifies a data operation, it activates the data processor.

The data processor DP directly executes the data constructs of the J73 language. It recognizes data and type declarations, creates data descriptors, and stores the data descriptors in the data associative memory DAM. The data descriptors in the DAM allow data references by symbolic names and permit rapid access of data values in the data memory. In addition, the DP executes assignment statements, evaluates formulae, and handles parameters. The structure of the data processor DP consists of one associative memory, 1 register, and 5 stacks as shown in Fig. 5.

[Fig. 2 Organisation of a Direct Execution Computer]

(a) The Data Memory stores the values of declared variables. One word of storage is allocated for each item or table element.

(b) Data Associative Memory DAM stores information about declared numeric variables. Name and Block-id fields identify each entry. Items and one-dimensional tables are the only possible structures. Possible types are signed and unsigned integer and floating real numbers. The Size field identifies the number of DM words allocated for the variable. The trace-id field acts as a flag which identifies whether the variable is being traced.

(c) Register TRACE is a two-field register that serves as a flag to identify whether a variable is the object of a TRACE directive. The Flag field is the switch, and the Name field saves the name of an assignment statement's variable for use in the output message which notes the variable's new value.

(d) Stack SYNTAX contains the current syntax productions being executed.

(e) Stack VSTACK holds the value and type of formulas' operands and intermediate results. Operands must be items or tables.

(f) Stack PSTACK holds the DM-locn of variables. The DM-locns of loop statement control
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on  one  stack  throughout 
execution of the loop statement, since the control variable is changed on every iteration.

(g) Stack OPSTACK saves lower-precedence operators during evaluation of a formula.

(h) Stack APSTACK contains seven fields which save information about the actual parameters of a procedure call. Five of the fields hold the value, type, structure, size and parameter type (input or output) of a parameter. In addition, an output parameter's DM-locn is saved (so its value can be returned), and its name is saved if it is being traced.

(i) Stack FPSTACK has three fields to save information about an active procedure's formal parameter. The name and parameter type make up two fields. The third (Decld) is a flag which is set during the procedure's first execution if that parameter is declared in the module. All formal parameters must be declared. Also, the number, type and structure of actual and formal parameters must match.


The data processor DP processes data declarations and controls data flow. It is activated by CP, but it also communicates with LP. The data constructs that are processed by DP are:

(1) directive

(2) item and table declarations

(3) assignment statement

(4) formula

(5) boolean formula

(6) variable and subscript

(7) formal parameters

(8) actual parameters
4.1  Directives 

The TRACE directive is a special statement which directs a message to be output whenever a variable named in the directive is assigned a value. The syntax is:

17. <directive> ::= !TRACE <name>,...;

4.2 Item and Table Declarations


Machine A accepts declarations in data, procedure, define, and block declarations. The CP executes define declarations. The DP executes data and block declarations. The syntax of declarations is shown below.

18. <item-declaration> ::= ITEM <name> (S!U!F);

19. <table-declaration> ::= TABLE <name> [<dimension>] (S!U!F);

20. <dimension> ::= (<integer formula>)

The above syntax calls for the following hardware sequences.

01 ITEM...DECL
02 ITEM

01 TABLE...DECL
02 TABLE
02 DIMENSION
(a) Item Sequences

Item sequences consist of sequence ITEM...DECL and sequence ITEM, which create an entry in the DAM from the name and attribute in the item-declaration and allocate a DM word.

(b) Table and Dimension Sequences

Table sequences consist of sequences TABLE...DECL, TABLE, and DIMENSION. Sequences TABLE...DECL and TABLE create an entry in the DAM. Sequence DIMENSION calculates the value of the dimension; only one dimension is allowed, and it needs only an upper bound (the lower bound is 0). This value is inserted into the size field, and a block of contiguous DM words equal to this value is allocated.
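The item and table sequences above can be illustrated with a minimal software sketch. It assumes the DAM can be modelled as a dictionary keyed by name and the Data Memory as a list of words; the class name `DataProcessor`, the field names, and the zero initial value are illustrative assumptions, not part of Machine A.

```python
class DataProcessor:
    """Toy model of DAM entry creation and DM word allocation (names are hypothetical)."""

    def __init__(self):
        self.dm = []    # Data Memory: one word per item or table element
        self.dam = {}   # Data Associative Memory: name -> descriptor

    def _declare(self, name, type_code, structure, size):
        if name in self.dam:
            # Names may not be declared twice in the same module.
            raise ValueError(f"{name} is already declared")
        self.dam[name] = {
            "type": type_code,        # 'S', 'U' or 'F'
            "structure": structure,   # 'item' or 'table'
            "size": size,             # number of DM words allocated
            "dm_locn": len(self.dm),  # first DM word of the variable
        }
        self.dm.extend([0] * size)    # allocate a contiguous block of DM words

    def declare_item(self, name, type_code):
        self._declare(name, type_code, "item", 1)

    def declare_table(self, name, type_code, dimension):
        # The dimension value goes into the size field, as described above.
        self._declare(name, type_code, "table", dimension)


dp = DataProcessor()
dp.declare_item("A", "S")
dp.declare_table("T", "F", 4)
print(dp.dam["T"])
```

Keeping the DM location in the descriptor is what lets later sequences refer to a variable by its symbolic name and still reach its storage directly.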


4.3 Assignment Statement

An assignment statement causes the value of a formula at the right of an equal sign to be assigned to the variable at the left of the equal sign. A variable is a name or a subscripted name. A subscript is an integer enclosed by a pair of brackets. The syntax is shown below.

21. <assignment-statement> ::= <variable> = <formula>


(a) Sequence ASSIGN-STMT

This sequence calls sequence VARIABLE to identify the object variable, and then calls sequence FORMULA to evaluate the formula. It then stores the formula's value into the DM location pointed to by the top entry of PSTACK.

(b) Sequence TRACE-CHECK

Sequence TRACE-CHECK identifies whether the variable in an assignment statement or the output portion of an actual-parameter-list is being traced.

4.4 Boolean Formula

A boolean formula represents a value of TRUE or FALSE. It occurs in the IF-clause or the WHILE-clause. It can be a formula, optionally followed by a relational operator and a further variable or formula. The syntax is shown below.

24. <boolean-formula> ::= <formula> [(< ! <= ! = ! >= ! > ! <>) <formula>]

The resulting value from a relational operation is either integer 1 for TRUE or zero for FALSE. The truth value of the boolean formula's result is determined by examining its low-order bit. A '1' is TRUE, a '0' is FALSE. This implementation makes odd integers evaluate to TRUE and even integers to FALSE.
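The low-order-bit rule can be made concrete with a short sketch. It assumes integer operands and the six relational operators of the syntax above; the function names `relational` and `truth_value` are hypothetical and only mirror the behaviour described in the text.

```python
def relational(op, left, right):
    """Return integer 1 for TRUE or 0 for FALSE, as a relational operation does."""
    outcomes = {
        "<": left < right,
        "<=": left <= right,
        "=": left == right,
        ">=": left >= right,
        ">": left > right,
        "<>": left != right,
    }
    return 1 if outcomes[op] else 0


def truth_value(result):
    """Examine only the low-order bit of an integer result: 1 is TRUE, 0 is FALSE."""
    return (result & 1) == 1


# A bare formula used as a boolean formula: odd is TRUE, even is FALSE.
for value in (relational("<", 2, 3), 7, 10):
    print(value, truth_value(value))
```

The consequence noted in the text follows directly: a formula without a relational operator is TRUE exactly when its integer value is odd.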

4.5 Formula

A formula represents a value. It can be an integer formula or a floating formula, representing either an integer or a floating-point number, respectively. An integer formula is a positive or a negative integer term, which can be added to or subtracted from a succeeding integer term. This intermediate result can then be added to (or subtracted from) another integer term, and so on. (The arithmetic operators are left associative.) An integer term is an integer factor, which can be multiplied or divided by succeeding integer factors (as with terms). An integer factor can be an integer literal, a variable, or an integer formula enclosed by a pair of parentheses.
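The integer-formula rules above translate naturally into a small recursive evaluator. The following is a software sketch only, not the hardware sequence FORMULA: the tokenizer, the `env` dictionary standing in for DAM/DM lookups, and the use of floor division for '/' are all assumptions made for illustration.

```python
import re


def tokenize(text):
    # Integer literals, names, and single-character operators and parentheses.
    return re.findall(r"\d+|[A-Za-z][A-Za-z0-9]*|[-+*/()]", text)


def eval_factor(tokens, env, pos):
    """factor = integer literal | variable | ( formula )"""
    if pos >= len(tokens):
        raise ValueError("factor expected")
    token = tokens[pos]
    if token == "(":
        value, pos = eval_formula(tokens, env, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("')' expected")
        return value, pos + 1
    if token.isdigit():
        return int(token), pos + 1
    return env[token], pos + 1  # variable lookup (stands in for the DAM)


def eval_term(tokens, env, pos):
    """term = factor {(*|/) factor}, applied left to right."""
    value, pos = eval_factor(tokens, env, pos)
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        right, pos = eval_factor(tokens, env, pos + 1)
        value = value * right if op == "*" else value // right  # assumed division rule
    return value, pos


def eval_formula(tokens, env, pos=0):
    """formula = [+|-] term {(+|-) term}, applied left to right."""
    sign = 1
    if pos < len(tokens) and tokens[pos] in ("+", "-"):
        sign = -1 if tokens[pos] == "-" else 1
        pos += 1
    value, pos = eval_term(tokens, env, pos)
    value *= sign
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = eval_term(tokens, env, pos + 1)
        value = value + right if op == "+" else value - right
    return value, pos


def evaluate(text, env=None):
    tokens = tokenize(text)
    value, pos = eval_formula(tokens, env or {}, 0)
    if pos != len(tokens):
        raise ValueError("unexpected token " + tokens[pos])
    return value


print(evaluate("10 - 4 - 3"))  # left associative: (10 - 4) - 3
```

The `while` loops are what make the operators left associative: each intermediate result is combined with the next term or factor before anything further is read.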

Floating formulas are similar to integer formulas, except a factor must be of floating type. In addition, a floating factor may be a call of function FLOAT, which converts an integer formula's value to floating form. The syntax for
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(a)  Actual  Input  Parameters 

The actual input parameters can be 0 or more formulas. Each formula is evaluated, and its value, type, structure and parameter type are pushed onto APSTACK.

(b) Actual Output Parameters

The actual output parameters can be 1 or more variables. Since each of the actual output parameters that are not tables is returned a new value, their DM-locns are also saved in the APSTACK. Output parameters being traced also have their names placed in the Name field.

(c) Parameter Matching

Corresponding actual and formal parameters must agree in type, structure, size, and input/output type.

5.  Lexical  Processor 

The J73 program is a string of characters. The lexical processor LP scans the characters in the source program, checks their legality, and assembles them into tokens. The tokens can be reserved words such as "ITEM" and "IF", operators such as "+", names, or numbers. The lexical processor together with the associative memory SAM also handles define declarations, define calls, and comments. It also handles the directive.

The structure of the lexical processor LP consists of an associative memory and registers as shown in Fig. 5. They are described below.

(a) Program Memory PM contains the text of the JOVIAL program to be executed. It is arranged as one long string of characters. Each character is assigned an ordinal position so it can be identified by register SPTR.

(b) Scanner Associative Memory SAM stores information about define declarations. For each valid define declaration, an entry is created to store the name of the declaration, the location of the first character of the define body, and the first location after the last character of the define body.

(c) Table LEGALCHAR contains valid characters of the JOVIAL syntax and their respective classes.

(d) Table RESERWORD contains reserved words and their type. Special reserved words are 'DEFINE' (type 'D') and 'FLOAT' (type 'FLOAT'); the others are type 'R'.

(e) Register CHAR holds the last character fetched from Program Memory.

(f) Register CLASS holds the class of the character stored in register CHAR. The class, an integer, is found by searching the LEGALCHAR table.

(g) Stack RETURN saves the SPTR position immediately following a define call so that, after SPTR has advanced over the define body, it is reset to the proper position to continue program execution.

(h) Stack DEF-END saves the end-ptr positions of the bodies of active define calls. When SPTR reaches the position pointed to by the top entry of the stack, that define call is completed and a return is performed by popping DEF-END and popping RETURN into SPTR. Recursive define calls are not allowed.

(i) There are two tables: LEGALCHAR and RESERWORD. The LP needs to check each character of the source program to determine whether it is legal by looking up table LEGALCHAR. It needs to determine whether the new token is a reserved word by looking up table RESERWORD. The legal character table is shown in Table 2; there are 56 legal characters in 10 classes. The reserved word table is shown in Table 3; there are 19 reserved words.

The lexical processor LP scans the source string of characters, checks their legality, and assembles them into tokens. It is activated by either CP or DP. The lexical constructs are:

(1) token

(2) character

(3) name

(4) number

(5) operator

(6) define and comment

The hardware sequences of the LP, which have sequence NEXT-TOKEN as the root sequence, consist of:

01 NEXT-TOKEN
02 NEXT-CHAR
02 NAME
02 DIRECTIVE-NAME
02 DEFINE-DECL
02 DEFINE-CALL
02 REL-OP
02 NUMERICAL-LITERAL
03 EXPONENT
03 FRACTION
02 COMMENT

5.1 Token

A token is the lexical element of a source program. It can be a name, a number, an operator, or a separator as shown in the syntax below.

32. <next-token> ::= <name> ! <numeric-literal> ! <operator-separator> ! <reserved-word>

38. <operator-separator> ::= ( ! ) ! : ! ; ! , ! = ! + ! - ! * ! / ! " ! ' ! . ! <> ! < ! > ! <= ! >= ! ! ! blank

39. <reserved-word> ::= START ! PROGRAM ! TERM ! BEGIN ! END ! ITEM ! TABLE ! REF ! PROC ! FOR ! BY ! WHILE ! IF ! ELSE ! E ! S ! U ! F

Sequence NEXT-TOKEN is designed to assemble the adjacent characters in the source program into a token. It extracts the next logical group of characters (the next token) from PM. The token may be a reserved word, identifier, numeric-literal, operator, separator or directive. This sequence also handles define-declarations (macro definitions) and define-calls, because they affect the control flow of program text.

Initially, the starting position of the token is stored in register BACK-PTR. Then the token is formed. If the token is the reserved word 'DEFINE', a define-declaration is processed; if it is an identifier with an entry in the SAM, a define-call is processed. When the next token to be passed to the other processors has been formed, the 'noise' following it is skipped over. Noise consists of blanks, illegal characters and comments. Upon return, the token will be in register TOKEN, its type will be in register TYPE, and register SPTR will be pointing to the beginning of the next token to be formed.

Sequence NEXT-TOKEN fetches the next character from the source program and then acts according to the class number of the character as follows.

class 1: An illegal char. Call ERROR.

class 2: A blank. Skip the blank.

class 3: A letter. The succeeding characters are assembled into a name. The name can be a reserved word, an LP command, or an operand name.

class 4: A digit or period. The succeeding characters are assembled into a number.

class 5: A decimal point. This case is handled the same as class 3.

class 6: Unary operator '+' or '-'. It is stored in register TOKEN.

class 7: An operator. It is stored in register TOKEN.

class 8: A '<' or '>'. A two-character operator ('<=', '>=' or '<>') is assembled.

class 9: A '!'. A directive name is assembled and identified.

class 10: A double-quote. A comment is flushed out.
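The class-driven dispatch above can be sketched in software as follows. Table 2 is not reproduced here, so the class assignments in `char_class` are assumptions inferred from the class list; treating a decimal point as the start of a number, limiting names to upper-case letters and digits, and the token kind labels are likewise illustrative choices rather than Machine A's definition.

```python
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS = "0123456789"


def char_class(ch):
    """Assumed stand-in for a LEGALCHAR lookup; returns a class number 1..10."""
    if ch == " ":
        return 2
    if ch in LETTERS:
        return 3
    if ch in DIGITS:
        return 4
    if ch == ".":
        return 5
    if ch in "+-":
        return 6
    if ch in "*/=():;,'":
        return 7
    if ch in "<>":
        return 8
    if ch == "!":
        return 9
    if ch == '"':
        return 10
    return 1  # anything else is illegal


def scan_while(text, pos, classes):
    """Advance pos while the characters belong to the given classes."""
    while pos < len(text) and char_class(text[pos]) in classes:
        pos += 1
    return pos


def next_token(text, pos):
    """Return (token, kind, next_pos), or None when the text is exhausted."""
    while pos < len(text):
        ch = text[pos]
        cls = char_class(ch)
        if cls == 1:
            raise ValueError(f"illegal character {ch!r} at position {pos}")
        if cls == 2:                      # blank: skip it
            pos += 1
        elif cls == 10:                   # comment: flush up to the closing quote
            pos = text.index('"', pos + 1) + 1
        elif cls == 3:                    # letter: assemble a name
            end = scan_while(text, pos, (3, 4))
            return text[pos:end], "name", end
        elif cls in (4, 5):               # digit or decimal point: assemble a number
            end = scan_while(text, pos, (4, 5))
            return text[pos:end], "number", end
        elif cls == 8:                    # '<' or '>': may be a two-character operator
            two = text[pos:pos + 2]
            if two in ("<=", ">=", "<>"):
                return two, "operator", pos + 2
            return ch, "operator", pos + 1
        elif cls == 9:                    # '!': assemble a directive name
            end = scan_while(text, pos + 1, (3,))
            return text[pos:end], "directive", end
        else:                             # classes 6 and 7: a single character
            return ch, "operator", pos + 1
    return None


def tokens(text):
    out, pos = [], 0
    found = next_token(text, pos)
    while found is not None:
        token, kind, pos = found
        out.append((token, kind))
        found = next_token(text, pos)
    return out


print(tokens('IF A1 <= 10 "note" X'))
```

Dispatching on a small class number rather than on the character itself keeps the decision logic to ten cases, which is the point of storing classes alongside the legal characters.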

5.2 Character

A character can be a letter, a digit, or a mark. There are 10 digits, 26 letters, and 17 marks, as shown below.

36. <character> ::= <letter> ! <digit> ! <mark>

43. <digit> ::= 0 ! 1 ! 2 ! 3 ! 4 ! 5 ! 6 ! 7 ! 8 ! 9

44. <letter> ::= A ! B ! C ! D ! E ! F ! G ! H ! I ! J ! K ! L ! M ! N ! O ! P ! Q ! R ! S ! T ! U ! V ! W ! X ! Y ! Z

45. <mark> ::= + ! - ! * ! / ! = ! < ! > ! . ! : ! , ! ; ! ( ! ) ! ' ! " ! ! ! blank


Sequence NEXT-CHAR fetches the next character from the source program in program memory. The next character is pointed to by register SPTR and becomes available in register CHAR. A test must now be made to determine if SPTR points next to the end of a define body by comparing it to register DEF-END. If it does, SPTR is given the value of register RETURN (i.e. to return from the define call) before the next character is made available. The character is then tested for legality and register CLASS is set to the class number of the character.

5.4 Numeric-literal

A numerical-literal is a positive integer, and a floating literal is a numerical-literal with a decimal point. The lexical rules for numerical-literals and floating-literals are shown below.

34. <numeric-literal> ::= <integer-literal> ! <floating-literal>

35. <integer-literal> ::= <digit>...

36. <floating-literal> ::= <digit>... <exponent> ! [<digit>...] . <digit>... [<exponent>]

37. <exponent> ::= E [+!-] <integer-literal>

Sequence NUMERICAL-LITERAL needs to detect the sequential combinations of digit, period, 'E', '+', and others. There are 3 sequences as shown below.

01 NUMERICAL-LITERAL
02 EXPONENT
02 FRACTION

Sequence NUMERICAL-LITERAL constructs numerical-literals. There are two types: integer (type 'I') and floating (type 'FL'). Floating-literals have a decimal point and/or an exponent; integer-literals have neither. Sequence FRACTION extracts the digits following the decimal point of a floating-literal, while sequence EXPONENT extracts the exponent part of a floating-literal.
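The distinction between the two literal types can be sketched with regular expressions. The patterns below assume one reading of the literal syntax, namely digits followed by an exponent, or optional digits, a decimal point, digits, and an optional exponent; the function name `classify_literal` and the type codes returned are illustrative.

```python
import re

# E, an optional sign, then an integer-literal.
EXPONENT = r"E[+-]?\d+"

# Floating: digits with an exponent, or [digits] . digits [exponent].
FLOATING = re.compile(rf"(?:\d+{EXPONENT}|\d*\.\d+(?:{EXPONENT})?)")

# Integer: digits only, with neither decimal point nor exponent.
INTEGER = re.compile(r"\d+")


def classify_literal(text):
    """Return 'FL' for a floating-literal, 'I' for an integer-literal, else None."""
    if FLOATING.fullmatch(text):
        return "FL"
    if INTEGER.fullmatch(text):
        return "I"
    return None


for sample in ("42", "3.14", "2E10", ".5E-3", "E5"):
    print(sample, classify_literal(sample))
```

In the hardware the same decision is spread over sequences FRACTION and EXPONENT; a literal becomes floating as soon as either one is entered.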

5.5 Relational Operators

The operators of Machine A consist of single- and double-character operators and the reserved words. Sequence REL-OP extracts the relational operators '<', '>', '<>', '<=' or '>='.

5.6 Comment

The comment is a string of 0 or more characters enclosed by a pair of quotes. The syntax is shown below.

35. <comment> ::= "[<character>...]"

Sequence COMMENT flushes out the string of characters.

5.7  Define  (Fig.  27) 

The define-declaration is a macro definition; its body, like a comment, is a string of 0 or more characters. The syntax is shown below.

43. <define-declaration> ::= DEFINE <name> "[<character>...]"

46. <define-call> ::= <name>

Sequence DEFINE-DECL processes a define-declaration. The define-name cannot be the same as any name declared in the same module or any procedure name. For each valid declaration, a SAM entry is created to hold the name, module-id, location of the first character of the define-body, and location of the double-quote (") which signals the end of the define-body. The define-body is enclosed in double-quotes, so no comments are allowed between the define-name and define-body.

Sequence DEFINE-CALL processes a define-call. A define-call is not allowed when the name of a declaration or a procedure is the next token expected. On a valid call, the return location is saved by pushing it onto stack RETURN, register SPTR is assigned to point to the first character of the define-body, and the end position of the define-body is pushed onto stack DEF-END.

The top of RETURN identifies the 'define-activation level' of the source program. This level needs to be known by the CP to determine if control statements may have CAM entries, so it is always stored in register D-LEV.
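The interplay of SPTR, RETURN and DEF-END during a define call can be sketched as below. This is a software model under stated assumptions: the SAM is a dictionary from name to (body start, body end), the class name `Scanner` and its method names are hypothetical, and the define-activation level is represented here simply as the depth of the RETURN stack.

```python
class Scanner:
    """Toy model of define call and return in the lexical processor."""

    def __init__(self, program, sam):
        self.pm = program        # Program Memory: one long string of characters
        self.sam = sam           # name -> (first body char, position of closing quote)
        self.sptr = 0            # register SPTR
        self.return_stack = []   # stack RETURN
        self.def_end = []        # stack DEF-END

    def define_call(self, name, return_pos):
        start, end = self.sam[name]
        self.return_stack.append(return_pos)  # where to resume after the body
        self.def_end.append(end)              # where the body ends
        self.sptr = start                     # continue scanning inside the body

    def next_char(self):
        # Return from a completed define body before fetching the character.
        while self.def_end and self.sptr == self.def_end[-1]:
            self.def_end.pop()
            self.sptr = self.return_stack.pop()
        ch = self.pm[self.sptr]
        self.sptr += 1
        return ch

    @property
    def d_lev(self):
        # Assumed representation of the define-activation level.
        return len(self.return_stack)


program = 'DEFINE N "10" X=N;'
start = program.index('"') + 1
end = program.index('"', start)
scanner = Scanner(program, {"N": (start, end)})

call_pos = program.rindex("N")           # the define-call in X=N;
scanner.define_call("N", call_pos + 1)   # resume just after the name
print(scanner.next_char(), scanner.next_char(), scanner.next_char())
```

Because the body is scanned in place in Program Memory, a define call costs two pushes and one pointer assignment; no text is copied.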

6.  Concluding  Remarks 

The above JOVIAL Direct-Execution Machine A directly reflects the language constructs of the J73 language. The lexical processor directly recognizes the legal characters, reserved words, operands, and operators. It assembles tokens and executes lexical "commands" (such as DEFINE); this lexical processor organization reflects the lexical constructs of the J73 language. The control processor directly executes the control statements and sequences the order of execution of the assignment statements; this control processor organization reflects the control constructs of the J73 language. The data processor directly references symbolic names and executes data operations; this data processor organization reflects the data constructs of the J73 language.

The above JOVIAL Machine A is a multiprocessor system, each processor performing a function reflecting language constructs. If the lexical processor were operated in a parallel but synchronized manner with the control processor and data processor, the repeated lexical processing in a program loop would not impede the execution speed. By using the information about the control structure of the source program in the associative memory CAM, there need be no repeated syntactical processing of the control statements in a program loop.

The idea of a direct-execution machine is simple, but its structure can be highly complex if the programming language, such as JOVIAL, is complex. Thus, there are two issues: the issue of the programming language and the issue of the computer architecture for the programming language. Criticisms of a particular direct-execution machine should state clearly whether they address the language issue or the architecture issue.

Abstract

Researchers have realized that von Neumann machines do not adequately provide for the constructs that occur in common programming languages. Most of these shortcomings are attributable to a phenomenon known as the semantic gap. Over the past decade there has been increased interest in building machines that have a smaller semantic gap. It can be conjectured that there exists an 'ideal' directly executable language (DEL) which describes an architecture with a smaller semantic gap than conventional machines. The proof of this conjecture will enable us to evaluate candidate machine instructions and to select the most suitable machine language for a given computing environment. In order to prove this conjecture, certain characteristics of machines, like the level of a machine with respect to a high-level language, must be quantified. Halstead's Software Science metrics are used for this purpose.


Introduction 

Before we start our introduction, we would like to define precisely the meaning of the term architecture as used in this paper. Computer architecture is the virtual machine as viewed by a machine language programmer. This is the view held by Flynn [75]. Thus, changing the machine language (assembler language) changes the architecture. Using the same argument, all models of the IBM/370 have the same architecture.

Researchers have realized that von Neumann machines do not adequately provide for the constructs that occur in common programming languages. Most of these shortcomings are attributable to a phenomenon known as the semantic gap (Gagliardi [73]). The semantic gap is a measure of the difference between the concepts in high-level languages and the concepts in computer architecture. Most current systems have an undesirably large semantic gap in that the objects and operations reflected in their architecture are rarely closely related to the objects and operations provided in programming languages. As shown by Myers [78], this large semantic gap contributes to software unreliability, performance problems, excessive program size, compiler complexity and distortions of the programming languages, all of which contribute negatively to the economics of data processing.


The semantic gap can be reduced by constructing a high-level language machine for each language. Such high-level language machines have many advantages (Tannenbaum [76]). Over the past decade, there has been increased interest in building machines that have a smaller semantic gap. These attempts are surveyed in Carlson [73] and Myers [78]. The proposed designs fall into 3 categories:

1. 'Truly' high-level language processors.

2. 'Pseudo' high-level language processors.

3. Intermediate language processors.

In 'truly' high-level language processors (e.g. Bloom [73]), the processor accepts a program string written in a high-level language and performs operations as determined by the semantics of the program string. The important characteristic of this design is that the architecture operates on the program directly. A little thought will convince the reader that such a design is not the ideal alternative to von Neumann architectures from either the memory size standpoint or the interpretation time standpoint (Hoevel [74]).

In 'pseudo' high-level language processors (e.g. Burkle et al. [78]), the source program is preprocessed; the software preprocessor performs a lexical transformation on the input, changing the keywords and operators into internal code. All data objects in the program are replaced by references to memory locations. With the exception of superfluous blanks, preprocessing is an isomorphism.

The two high-level language processor designs described above are highly source language dependent, and so a machine should be constructed for each high-level language. In the case of intermediate language processors, the source program is converted into a program in an intermediate language. The resulting surrogate program is executed by the architecture. It has been established (Wade and Schneider [73] and Lancaster [72], [76]) that a certain set of semantic primitives can adequately express the major portion of the semantics of programs written in any of the several common high-level languages. Therefore, it is conjectured (Wade and Schneider [73]) that by designing a computer organization which implements a set of semantic primitives that describe common high-level constructs, one instruction per primitive, speed increases approaching that of a 'truly' high-level language processor can be achieved while retaining the flexibility characteristic of software dominated conventional machines.

The authors believe that the intermediate language processor is the desirable choice. The authors also believe that there exists a direct relationship between the level of a target machine with respect to the source language (cf. Section 2) and the machine's dependence on the source language. That is to say, the higher the level of a machine with respect to a language, the more language dependent the machine will be.

Because of this relationship one can measure the closeness of a language to the machine. It can be conjectured that there exists an 'ideal' directly executable language (DEL) which describes an architecture with a smaller semantic gap than conventional machines. Hoevel [74] gave an analytical argument to show the existence of an 'ideal' directly executable language which performs better than conventional machines. In order to prove the above conjecture we must quantify certain characteristics of machines, like the level of a machine with respect to a source language and the semantic gap. This is the topic of the present research. The metrics defined in this work are based on Halstead's Software Science (Halstead [77]). This research is a step in the direction of quantifying architectures and is an attempt to bridge the gap between language designers and computer architects. The metrics defined can be used either to evaluate candidate intermediate languages and select the most suitable machine language for a given computing environment or to evaluate existing machines for a given environment.

Hoevel [74] has argued that neither the machine language of conventional machines nor the source language is an 'ideal' DEL, either from the interpretation standpoint or from the storage point of view. He contends that an 'ideal' DEL for a contemporary computing system lies somewhere between its source language and the language accepted by its base machine. In this research, we attempt to prove that an 'ideal' DEL from the semantic gap standpoint also lies somewhere between the source language and machine language. In the next section, software metrics that will be used to quantify architectures are defined. Results obtained so far are included.

Section  2 

Halstead  end  hie  students  (Halstead  [73], (77] 
and  Software  Engineering  [79])  found  that  applica¬ 
tion  of  the  classical  methods  of  natural  sciences 
demonstrate  that  even  such  Intangible  objects  as 
written  abstracts  and  computer  program*  ere  governed 
by  natural  lews.  Sons  of  the  nettles  used  by  them 
that  ere  pertinent  to  present  work  ere  now  presented 
without  explanation.  Interestad  readara  should 
refer  to  Halstead  [77]  for  details. 

1.  The  Volume  Vi  A  suitable  nettle  for  the  sire 
of  any  Implementation  of  an  algorithm,  called  the 
volume  V,  cen  be  defined  ee 

V  -  N  log 2  n  (I) 


where  N  is  its  length  and  n  Is  the  size  of  Its 
vocabulary. 

2.  The  Potential  Volume  V*;  The  moat  auccinct 
form  in  which  an  algorithm  could  ever  be  expressed 
would  require  the  prior  existence  of  e  language  in 
which  the  required  operation  waa  already  defined 
or  implemented,  perhaps,  ea  e  subroutine  or  a 
procedure.  The  potential  volume  of  an  algorithm 
is  the  volume  of  the  program  which  expresses  the 
algorithm  in  its  most  auccinct  form'. 

V*  -  (2  +  n*)  log2  (2  +  n$)  (2) 

where  n!|  is  the  number  of  unique  operands. 

3.  The  level  of  a  Program  L:  Since  there  can  be 
more  than  one  possible  implementation  of  an  algo¬ 
rithm,  it  is  necessary  to  define  the  level  of  a 
program.  The  level  of  a  program  L  la  defined  as 

L  -  V*/V  (3) 

4. The Level of a Language λ: When different algorithms are programmed in a given implementation language, it is observed (Halstead [77]) that as the potential volume V* increases the program level L decreases proportionately. Consequently, the product L times V* remains constant for any language. This product, the language level, is denoted by λ:

$$\lambda = L \, V^* \tag{4}$$

The four quantities defined above form the basis of our research. In order to discuss the details a few more terms must be introduced.
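The four definitions above can be exercised with a short sketch. The function names and the operator and operand counts below are hypothetical, chosen only to make the arithmetic easy to follow; they are not measurements from this paper.

```python
import math


def volume(length, vocabulary):
    """Volume V = N log2 n, equation (1)."""
    return length * math.log2(vocabulary)


def potential_volume(unique_operands):
    """Potential volume V* = (2 + n2*) log2 (2 + n2*), equation (2)."""
    size = 2 + unique_operands
    return size * math.log2(size)


def program_level(v_star, v):
    """Program level L = V* / V, equation (3)."""
    return v_star / v


def language_level(v_star, v):
    """Language level lambda = L * V*, equation (4)."""
    return program_level(v_star, v) * v_star


# Hypothetical program: length N = 64, vocabulary n = 16, n2* = 6.
V = volume(64, 16)              # 64 * log2(16)
V_STAR = potential_volume(6)    # 8 * log2(8)
L = program_level(V_STAR, V)
LAMBDA = language_level(V_STAR, V)
print(V, V_STAR, L, LAMBDA)
```

With these illustrative counts the program level is well below 1, which is the expected situation for any implementation that is longer than its most succinct form.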

5. Level of a Machine with Respect to a Language: Certain machines are more closely related to the operations and data structures in a high-level language than other machines. A measurable quantity that describes this characteristic of a machine is in order. The level of a machine M with respect to a language L is defined as

$$\ell^{M}_{L} = \frac{V_L}{V_M} \tag{5}$$

where $V_L$ is the volume of an algorithm implementation in the language L and $V_M$ is the volume in the machine language of the machine M.

Remarks: 1. The authors strongly believe that the quantity in equation (5) is a constant for a given machine M and a language L (and a compiler) and does not vary significantly with either algorithms or programming styles.

2. Compiler overhead is included while measuring volume $V_M$ in equation (5). Thus, $V_M$ is the volume of the program translated into machine language M by a compiler starting with the program in the high-level language L. This approach is used for practical reasons.


3. If compiler overhead is to be excluded, a different metric, the Potential Level of a Machine, may be used:

$$\ell^{*M}_{L} = \frac{\lambda_M}{\lambda_L} \tag{6}$$

where $\lambda_M$ is the level of the machine language of machine M and $\lambda_L$ is the level of the high-level language L. Potential level can be greater than 1 since it is possible to have a machine language whose level is higher than that of a high-level language. The performance of a compiler can be evaluated using the two levels defined above.
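A minimal sketch of the two levels follows, assuming the level of a machine is the ratio of the high-level volume to the machine-language volume and the potential level is the ratio of the two language levels. The function names and all numbers are hypothetical.

```python
def machine_level(v_language, v_machine):
    """Level of a machine: volume in the high-level language divided by
    the volume of the compiled machine-language program."""
    return v_language / v_machine


def potential_machine_level(lambda_machine, lambda_language):
    """Potential level: machine-language level over high-level language
    level, which leaves compiler overhead out of the comparison."""
    return lambda_machine / lambda_language


# Hypothetical volumes and language levels for one program.
level = machine_level(v_language=200.0, v_machine=1000.0)
potential = potential_machine_level(lambda_machine=0.5, lambda_language=1.25)
print(level, potential)
```

Comparing the two numbers for the same machine and language gives a rough view of how much of the gap is due to the compiler rather than the machine language itself.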

Some Results: 32 FORTRAN programs written by graduate and freshman computer science students at SMU are used in our validation of equation (5). Operators and operands in the programs used are counted according to the rules suggested by Bulut [74], [73]. The results are given in Table 1. When these values are plotted (Fig. 1), a straight line relation between the two volumes with a correlation coefficient of 0.978 is observed. From the plot, the level of Compass (assembler language of Cyber) with respect to FORTRAN (using the FTN compiler with OPT = 0) is given by the slope of the curve:

$$\ell^{Compass}_{FORTRAN} \approx 0.17169$$
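The slope and correlation coefficient mentioned above can be obtained with an ordinary least-squares fit. The sketch below assumes the machine-language volume is on the x-axis and the high-level volume on the y-axis; the paper does not state the axis assignment, and the data points are hypothetical rather than taken from Table 1.

```python
import math


def fit_line(xs, ys):
    """Return least-squares slope, intercept and correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical volume pairs for four programs.
machine_volumes = [1000.0, 2500.0, 4000.0, 9000.0]
language_volumes = [210.0, 480.0, 820.0, 1790.0]

slope, intercept, r = fit_line(machine_volumes, language_volumes)
print(f"level (slope) = {slope:.4f}, correlation = {r:.4f}")
```

A correlation close to 1 supports treating the level as a constant of the machine, language and compiler, which is the claim of Remark 1.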


Similar computations are performed on COBOL and the results are tabulated in Table 2. The level of Compass with respect to COBOL is calculated to be (Fig. 2)

$$\ell^{Compass}_{COBOL} = 0.0317147$$

6. Dynamic Volume $V_d$: The volume of an algorithm defined in (1) is a static measure of the size of the algorithm and it can be used as an estimate of the memory required. However, the actual amount of code processed by the computer is different for different sets of data. Depending on the input, certain segments of the program may be executed more often than other segments. The Dynamic Volume of a program is the code of the program that is actually processed for a given set of data.

7. Average Information Rate $I_M$: Since it is possible to conceive of two machines with the same architecture where one machine executes programs faster than the other (e.g. the various models of the IBM/370 series), a measure of the processing speed of machines must be defined. The average information rate $I_M$ of a machine M is such a quantity and is given by equation (7), where T is a sufficiently long time period over which the behavior of the program is observed, $V_d$ is the dynamic volume of the program in the machine language and dT is the execution time of the program for the dynamic volume $V_d$.

Remarks: 1. To evaluate equation (7), a program (or a set of programs) must be executed with different sets of data. For each set of data, the dynamic volume $V_d$ and the execution time dT must be noted. Then, the integration in equation (7) can be approximated by summation.

2. The product $\ell^{M}_{L} I_M$ is a measure of the speed at which programs written in a high-level language L are executed on machine M.

Some Results: A simple program is run on Cyber 72 a number of times with various values for input data. The values of execution time for various dynamic volumes are plotted in Fig. 3. As can be seen from the graph, the rate at which Cyber processes information is fairly constant and is given by the slope of the graph.

$$I_{Cyber\ 72} = 25.746 \times 10^{6} \text{ bits/sec}$$

Implications

Although the results obtained so far are not enough to claim the validity of our metrics, they tend to support our intuition. However, since intuition is far from trustworthy, we are planning to collect data for three languages, FORTRAN, Pascal, and COBOL, and on three architectures, Cyber, AMDAHL, and TI 9900. We believe that this set is a representative class of languages and machines most commonly used.


Once the consistency of these metrics has been validated, they can be used to select a machine language that is best suited for a computing environment. Denoting the set of programming languages under consideration by P and the level of machine M with respect to language L by $\ell^{M}_{L}$, the machine language for which the quantity

$$\sum_{L \in P} k_L \, \ell^{M}_{L} \tag{8}$$

is maximum describes an architecture with a minimum semantic gap for the set of programming languages P. The constant $k_L$ in equation (8) is a weighting factor that reflects the frequency of usage of language L in a particular environment. Typically, if 90% of the time COBOL is used in a given environment, $k_{COBOL}$ will take a value of 0.9.

Equation (8) can also be used to evaluate existing architectures for a given environment. Use of the metrics defined in this paper provides useful information on the basic architecture of the machine, and the implementation details such as the information processing rate are separated from the architecture. This information is not provided by benchmarks, which reflect only the speed of execution of the benchmark programs on the machine. However, the authors believe that the counting techniques suggested by Bulut [74], [73] must be refined before existing architectures can be compared using our metrics.
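The selection rule of equation (8) can be sketched as follows, assuming the quantity being maximised is the usage-weighted sum of machine levels over the languages in P. The machine names, level values and weights are hypothetical and serve only as an illustration.

```python
def weighted_level(levels, weights):
    """Sum of k_L times the machine level for every language L in P."""
    return sum(weights[lang] * levels[lang] for lang in weights)


def best_machine(candidates, weights):
    """Return the candidate machine with the largest weighted level."""
    return max(candidates, key=lambda m: weighted_level(candidates[m], weights))


# Hypothetical environment: COBOL is used 90% of the time.
weights = {"COBOL": 0.9, "FORTRAN": 0.1}

# Hypothetical machine levels with respect to each language.
candidates = {
    "machine_a": {"COBOL": 0.03, "FORTRAN": 0.17},
    "machine_b": {"COBOL": 0.10, "FORTRAN": 0.12},
}

score_a = weighted_level(candidates["machine_a"], weights)
score_b = weighted_level(candidates["machine_b"], weights)
choice = best_machine(candidates, weights)
print(score_a, score_b, choice)
```

In this made-up environment the machine with the lower FORTRAN level is still preferred, because the weighting is dominated by COBOL usage.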




Observations

While compiling FORTRAN programs, we tried various optimisations that are available on the FTN compiler. After looking at the code generated, we decided to use only the code generated using the FTN compiler with no optimisation. The reason for this is the fact that optimisation is not linear; only certain portions of the program are optimised. For example, no attempt is made to reduce the code required to implement subroutine calls and passing of parameters. Thus, if a program has a large number of subroutine calls, the amounts of code generated by both the optimising compiler and the regular compiler are almost the same. This nonlinearity leads to an unfair comparison of FORTRAN programs.

We also observed that on Cyber 72, there are a few system macros to execute most commonly occurring FORTRAN functions like format conversions for READ and WRITE statements. Similar observations can be made in connection with COBOL programs. So, the authors would like to stress the fact that the numbers obtained are for a virtual machine as viewed by a compiler writer. However, the use of such macros strengthens our belief that a new machine language which has a higher level than conventional machine language is needed to improve the performance.

Conclusion 

In this paper, the authors have attempted to introduce the subject of their research. The authors started out with an assumption that there exists an 'ideal' machine language which has most of the advantages of high-level language processors while retaining the flexibility of conventional von Neumann machines. In order to prove this conjecture, a few metrics are defined. Using these metrics, a most suitable machine language for a given computing environment can be designed.

Although the actual values of our metrics may change if a different counting technique is used, the conclusions are still valid. The values obtained must be used only to compare two languages and no significance must be attached to the absolute values.

In our research, one basic assumption is that the language in which a program is written is the best language for that algorithm. However, we did not see any published results claiming the superiority of one language for a particular application. Our method can be extended to evaluate various programming languages for a given application. In order to do this, one has to write a number of programs (within a given area of application) in a set of programming languages and measure volumes of these algorithms in the different languages. The high-level language that has an overall minimum volume for the set of programs is the best implementation language for the area of application under consideration. Once again, we would like to caution the reader that the counting techniques may have to be refined before our method can be used for the suggested applications.

It is probably too early to outline the machine characteristics that cause semantic gap, but we observed that direct execution of a few high-level instructions would enhance the performance of computers appreciably. These instructions are very similar to the semantic primitives suggested by Lancaster [72].
V_FORTRAN     V_Compass      V_FORTRAN     V_Compass

2031.9704     6549.3468      449.4903      2921.7177
2347.1734     13946.4650     1946.9911     13916.5500
294.0315      1764.1995      3990.5540     23131.5030
421.1M7       2409.9471      3110.9361     19917.9670
111.0290      770.3919       1596.1231     9390.9700
24.0000       304.2272       3*3.3762      1917.2799
311.2140      3006.6073      1155.7944     6591.7922
352.3330      2313.9910      1063.3293     9349.3600
375.0000      2957.3906      449.0130      1900.2761
201.1415      1943.3554      124.0000      903.3911
291.3551      2171.9507      73.0924       362.2120
315.0000      2799.0950      194.4999      1150.3690
35.3509       549.5719       246.3799      1923.9272
953.0371      3941.6165      12.0000       107.5499
129.0000      1149.2961      95.1101       896.9999
511.9212      3646.4257      1193.0721     6934.6131

Table 1. Validation of Equation (5)

FTN compiler with OPT = 0 is used.

V_FORTRAN is the volume in FORTRAN. V_Compass is the volume in Compass.


V_COBOL        V_Compass

167.371790     3700.9530
227.548950     3977.9047
655.131790     11260.7840
230.321550     4645.9535
403.254150     10222.6950
483.308970     10455.0650
91.376518      2422.8076
21.000000      527.3324
46.506993      1250.2098
57.359400      1386.6956
122.984890     1951.2472
339.001500     7949.2895
245.969780     6198.6132
159.911340     3831.2925
286.620880     5333.9861
95.908275      2028.3122

Table 2. Validation of Equation (5)

COBOL compiler on Cyber 72 is used.

V_COBOL is the volume in COBOL. V_Compass is the volume in Compass.

Figure 2. Validation of Equation (5)
V_d                 dT        V_d                 dT

3244232.049943      0.305     68172286.369940     4.285
10488234.909940     0.606     68172286.369940     4.245
13732237.769940     0.897     6737339.370993      0.333
20976260.629940     1.237     10106234.930990     0.306
26220263.489940     1.326     13474930.330990     0.676
31464266.349440     1.903     16843623.710990     0.644
36708269.209940     2.327     20212321.090990     1.012
41932272.069940     2.346     23381016.470990     1.178
47196274.929940     3.006     26949711.830990     1.341
32440277.789940     3.297     30318407.230990     1.322
37684280.649940     3.376     33697102.610990     1.680
57684280.649940     3.329     37033797.990990     2.021
62928283.309940     4.031     40424493.370990     2.177
62928283.309940     3.623     43793188.730990     2.317

Table 3. Validation of Equation (7)

V_d is the dynamic volume in Compass. dT is the execution time on Cyber 72.




Figure 3. Validation of Equation (7)

$V_d \times 10^{-8}$ along X-axis. dT along Y-axis.

(units: $V_d$ in bits, dT in seconds)
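The remark that the integration in equation (7) can be approximated by summation can be sketched as below. The exact form of equation (7) is not legible in this copy, so the sketch assumes the simplest reading: total dynamic volume divided by total execution time over all observed runs. The run data are hypothetical and are not the values of Table 3.

```python
def average_information_rate(runs):
    """Approximate the average information rate in bits per second.

    `runs` is a list of (dynamic_volume_bits, execution_time_seconds)
    pairs, one pair per set of input data. The ratio of the two sums
    is an assumed discretisation, not the paper's stated formula.
    """
    total_volume = sum(volume for volume, _ in runs)
    total_time = sum(seconds for _, seconds in runs)
    return total_volume / total_time


# Hypothetical measurements for three input data sets.
runs = [(8.0e6, 0.5), (16.0e6, 1.0), (24.0e6, 1.5)]
rate = average_information_rate(runs)
print(f"{rate:.3e} bits/sec")
```

When every run has the same ratio of volume to time, as in this made-up data, the result coincides with the slope of the straight line through the plotted points.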
Abstract

Directly executed languages (DELs) as proposed by Flynn have variable sized fields for both operators and operands. For efficient implementation this architecture requires access to main memory at the bit level, and also requires powerful operations on variable sized bit fields in the host processor. For a hardware architecture based on bit slice processors and byte addressable memory it may be more advantageous to consider a byte oriented DEL. This simplifies the memory access hardware and makes the decoding of the DEL code a straightforward look up procedure. This paper reports on a project to build a Pascal oriented micro processor (POMP) and compares the POMP encoding of instructions with those of the DEL code. Initial results indicate that POMP code is less than fifty percent larger than DEL code and hence will be preferable when simplicity of interpretation is required.


Introduction 

A Pascal oriented micro processor is being built at Trinity College Dublin using AMD bit slice processors. It will be used for research into the emulation of intermediate forms for block structured languages. Pascal will be the initial language considered and will also be used in all examples in this paper. During the design an architecture to efficiently support Flynn's DELs was considered. It would have required bit addressable memory, and operators for variable sized bit fields. Instead an architecture based on byte sized instructions was chosen to give easier interpretation and a simpler main memory interface. It was also felt that a compiler producing byte sized instructions would be easier to construct than one producing DEL code. The only disadvantage is the loss of compactness of code. This paper reports on initial investigations into the comparison of the two encodings and considers the tradeoff between compactness of code and ease of interpretation.


The  work  described  herein  was  supported  in  part, 
by  the  Army  Research  Office  -  Durham  under  contract 
no.  0AAG29-78-0205. 
Directly Executable Languages

The aim of DEL code [1,2,3] is to provide an ideal architecture for any high level language. The encoding is not optimum in the Huffman coding sense but instead a compromise between compact coding and ease of interpretation. "An ideal representation must be concise in its coding of identifiers yet not so concise that it exacerbates interpretation" [2, page 22]. In this architecture the scope of an identifier in a procedure is very important. The address of an identifier is given as the address (offset) within the contour. Hence, the number of bits required to hold an address is given by log2(V) where V is the number of unique identifiers within the scope. Operators can also be encoded in this manner but the number of operators is small and hence a fixed encoding may be used instead. The DEL code instructions mirror the operations in the high level language giving three address type instructions. When the stack is required in expression evaluation then all loads and stores come as additions to the main operation being performed. In a sense they come for free. The loads and stores are not explicitly given in the DEL code; instead they are implicitly applied as part of other operations. Thirty two formats specify all the different forms of the three address instructions. The DEL encoding for a number of expressions is given in figure 1. The encoding contains the format, operands and operations fields. In the format field A, B and C represent the three operands of an instruction when they are not in the stack. S represents the resulting operand pushed on top of the stack, T represents the operand on top of the stack, which may be popped if required, and U is the next to top operand on the stack.
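The operand field width described above can be sketched in a few lines. The text gives log2(V); rounding up to a whole number of bits is an assumption made here, and the function name is hypothetical.

```python
def operand_field_bits(identifiers_in_scope):
    """Bits needed to address one of V identifiers within a contour.

    Computes ceil(log2(V)) with integer arithmetic to avoid floating
    point rounding. A scope with a single identifier needs no bits.
    """
    if identifiers_in_scope < 1:
        raise ValueError("a scope must contain at least one identifier")
    return (identifiers_in_scope - 1).bit_length()


for count in (1, 2, 5, 8, 9):
    print(count, operand_field_bits(count))
```

Because the width depends on the scope, the same identifier offset can occupy a different number of bits in different procedures, which is why a DEL host needs bit-level memory access.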

In examples 1 and 2 the operation is performed without the use of the stack. In examples 3 and 4 the stack is used and its use is indicated by the format of the instruction. In all the examples there is one DEL code instruction for each operator in the high level language expression. Note also that the load and store stack are implied by the format and combined with the instruction operation. One memory reference is saved in example 4 where the identifier K appears more than once. Conditional statements and the addressing of arrays can also be accomplished in a similar manner, as shown in figure 2.
The if statement in example 1 produces a DEL code instruction to test the condition and skip if the condition is not true, and another instruction to evaluate the expression K := K + 1, which is executed if the condition is true. In example 2 the array address calculation is considered as a single operation along with the assignment, and in example 3 it is considered as a single operation along with loading or storing from the stack.

This encoding produces compact code, anywhere from three to eight times more compact than that produced by compilers for traditional machines.


Pascal  Oriented  Microprocessor 


Procedures written in a structured manner, using a high level language, tend to be small [4]. The most frequently occurring statement is assignment, followed by procedure call, if and return. Assignment statements tend to be very simple with the majority having only one or two terms on the right hand side. The majority of procedures have a small number of formal parameters and a small number of local scalar variables. Hence, the addresses of local variables and the most frequently occurring global variables may be compactly encoded. The coding of procedure calls must also be carefully considered.


A significant compaction of code can be gained from the fact that during the execution of any Pascal statement the state of the processor is always known, e.g. integer or real. Between statements the state of the processor returns to the null state. For example the statements

var J, K : integer; A, B : real;

J := K + TRUNC(A + B)

produce the following instructions for a stack machine. The state of the processor is also given.

Instructions     Processor State

                 => Null
LOAD K           => Integer
LOAD A           => Real
LOAD B           => Real
ADD              => Real
TRC              => Integer
ADD              => Integer
STORE J          => Null

The processor state is null between each statement and is set by load instructions and, in this example, by the truncate instruction also. Advantage can be taken of this fact [5] to provide a two dimensional instruction set thereby greatly increasing the range of opcodes available. Eight states of the processor are used.

0 - Null
1 - Boolean
2 - ASCII (character)
3 - Address (pointer)
4 - Bit address (for packed structures)
5 - Integer
6 - Real
7 - Set

The state is contained in a three bit field in the processor's PSW. Some instructions are state independent, e.g. LOADS, and hence the opcode range is divided into a state dependent and a state independent range. Assuming that these ranges are equal in size then there are 1152 potential opcodes. For this architecture the low end of the opcode range is for state dependent instructions. The opcode range X'00' to X'2F' has been reserved for zero address instructions. The opcode X'10' represents integer addition if the processor state is integer and real addition if the state is real. The null and boolean states are used for unconditional jumps and false jumps respectively. A few lines from this area of the opcode table are shown in figure 3. Each opcode represents five different operations depending on the processor state.

Branch instructions are implemented in both a short and long form. The short form is given in this area of the opcode table and consists of two bytes in the following form.

[Diagram: two byte short branch format, an opcode field beginning with the bits 000 followed by the offset field]

This requires thirty two opcodes X'00' to X'1F'. The long form jump consists of an opcode followed by a two byte offset.

Load instructions, which are state independent, are used to load the stack and also set the processor's state. Eight of these are provided for each of the states: boolean, ASCII, address, integer, real and set. Three bits within the byte give the local variable number and the format is

[Diagram: one byte load format, an opcode field followed by the three bit local variable number]


The long form of these instructions is used if the procedure has more than eight local variables; this will occur six percent of the time [4].

Separate one byte opcodes are used to perform operations between the top of the stack and eight local variables. If a local variable is added to the top of the stack this requires a one byte instruction rather than two instructions in the conventional stack machine. These instructions are two dimensional in that the meaning of the operation also depends on the processor state. The operations involved are add, subtract, multiply, divide, compare for equality, compare for inequality and store. The null state of this part of the opcode table cannot be used with these operations and hence is used to zero local integer, increment local integer, and decrement local integer. These instructions also have eight different opcodes for eight local variables and again they replace either three or two instructions in the conventional stack machine.
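The two dimensional instruction set described in this section can be sketched as a small state tracker. The transition rules below are inferred from the worked example (loads set the state to the operand type, TRC yields integer, STORE returns to null, ADD leaves the state unchanged) and are an assumption, as are the function and variable names. The opcode count assumes 256 one byte opcodes split evenly between the two ranges.

```python
STATES = ("null", "boolean", "ascii", "address",
          "bit_address", "integer", "real", "set")


def trace_states(instructions, variable_types):
    """Return the processor state after each instruction."""
    state = "null"
    trace = []
    for opcode, operand in instructions:
        if opcode == "LOAD":
            state = variable_types[operand]   # loads set the state
        elif opcode == "TRC":
            state = "integer"                 # truncate real to integer
        elif opcode == "STORE":
            state = "null"                    # statement finished
        # ADD is state dependent: its meaning is chosen by the current
        # state and the state itself is left unchanged.
        trace.append(state)
    return trace


variable_types = {"J": "integer", "K": "integer", "A": "real", "B": "real"}
program = [("LOAD", "K"), ("LOAD", "A"), ("LOAD", "B"), ("ADD", None),
           ("TRC", None), ("ADD", None), ("STORE", "J")]
states = trace_states(program, variable_types)

# Opcode space: half of the byte values are state independent, the
# other half are multiplied by the eight processor states.
state_dependent = 256 // 2
state_independent = 256 - state_dependent
potential_opcodes = state_independent + state_dependent * len(STATES)
print(states, potential_opcodes)
```

The count of 1152 reproduces the figure quoted in the text, which suggests that the even split of a one byte opcode space is the intended reading.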

Test  Results 

The code generator (the assembler) of the P code compiler was modified in order to obtain a feel for the compactness of the POMP code. The modified compiler produces either P code [6] or a combination of P code and POMP code depending on the setting of a number of control flags. These flags are used to test out the relative importance of compacting different P code instructions rather than only obtaining the total effect. The P compiler produces code for a stack machine where all operations are performed on the top of the stack. Hence no advantage could be taken of the POMP instructions which operate between local variables and the top of the stack. For example the expressions A := B * C and A := A + 1 produce the following P code and POMP code:

A := B * C

P code -   LOAD B        POMP code -   LOAD B
           LOAD C                      MUL C
           MUL                         STA A
           STA A

A := A + 1

P code -   LOAD A        POMP code -   INC A
           INC 1
           STA A

The four areas most easily implemented and which were considered to result in the greatest compaction are:

1) Short branches - offset relative to PC

2) Loading and storing local variables

3) Loading small integers (0, 1 and 2), loading boolean true or false, increment and decrement top of stack by 1

4) Zero address operators, i.e. acting on the top two elements of the stack.

The P code compiler produces one P code instruction per 32 bit computer word. No compaction of the code was considered.

The results were then compared with the DEL code produced by a DEL compiler being implemented at the Stanford Emulation Laboratory. The object code sizes produced by compiling a quicksort program on the three different compilers were:

           DEL     POMP code     P code
Size       292     430           1004
Factor     1.0     1.47          3.44

The reduction in P code size due to the POMP instructions, broken down by compaction type, was:

Compaction in bytes (%)   Number of instructions   Compaction type

 34  (6)                   17                      Short branches
267 (47)                   89                      Load and store local variables
129 (22)                   43                      Load integers 0, 1, 2, load boolean true or false, increment and decrement by 1
144 (25)                   48                      Zero address instructions operating on top 2 elements of the stack
574                       197                      Total

The total number of instructions is 251 for both the P code and POMP code. In the POMP code they are broken down into 180 one byte instructions, 17 two byte instructions and 54 P code instructions. Almost half of the compaction is achieved by compacting the load and store locals. In contrast the short branches had almost no effect (6%).


From the preliminary results it looks improbable that the POMP code produced from the P code compiler can achieve the compactness of the DEL code. To do so, each POMP instruction would have to occupy only 1.16 bytes on average, as the DEL code for this program consists of only 68 instructions compared to 251 for the P compiler: a factor of 3.7. Even allowing for the fact that DEL operators often have implicit loads and stores associated with them there is still a remarkable difference in the number of operations. Hence the P code compiler has been discarded and present work is using a Pascal compiler which generates an abstract syntax tree during parsing. Using this compiler the full POMP code can be generated including the instructions which operate between local variables and the top of the stack. Statistics will also be generated on fourteen substantial Pascal programs giving the frequency of operators and memory references, and the resulting DEL and POMP codes will be compared.
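The size figures reported in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the text and assumes that the object code sizes are in bytes and that a P code instruction occupies one 32 bit word (4 bytes); the variable names are hypothetical.

```python
P_CODE_BYTES = 4                      # one instruction per 32 bit word
TOTAL_INSTRUCTIONS = 251
DEL_SIZE, DEL_INSTRUCTIONS = 292, 68

p_code_size = TOTAL_INSTRUCTIONS * P_CODE_BYTES

# POMP mix: 180 one byte, 17 two byte and 54 uncompacted P code words.
pomp_size = 180 * 1 + 17 * 2 + 54 * P_CODE_BYTES

# Bytes saved per compaction type, as listed in the breakdown.
savings = {
    "short branches": 34,
    "load and store locals": 267,
    "small constants, inc/dec": 129,
    "zero address instructions": 144,
}
total_saved = sum(savings.values())
percentages = {k: round(100 * v / total_saved) for k, v in savings.items()}

pomp_factor = round(pomp_size / DEL_SIZE, 2)
p_code_factor = round(p_code_size / DEL_SIZE, 2)
bytes_per_instruction_to_match_del = round(DEL_SIZE / TOTAL_INSTRUCTIONS, 2)
instruction_ratio = round(TOTAL_INSTRUCTIONS / DEL_INSTRUCTIONS, 1)

print(p_code_size, pomp_size, total_saved, percentages)
print(pomp_factor, p_code_factor,
      bytes_per_instruction_to_match_del, instruction_ratio)
```

The saved bytes account exactly for the difference between the P code and POMP sizes, so the reported breakdown is internally consistent under these assumptions.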

An advantage put forward for minimizing the number of instructions of the object code is that it speeds up the execution. A large number of instructions increases the fetch and decoding time but with instruction prefetch and with simple ROM look up decoding it is expected that the difference in execution speed due to this effect will be small.
Göran Bage, L M Ericsson, S-126 25 Stockholm, Sweden*

Lars-Erik Thorelli, Department of Telecommunication and Computer Systems, Royal Institute of Technology, S-100 44 Stockholm, Sweden


The  architecture  of  the  high-level  language 
machine  IAX2,  designed  for  efficiency  in  string 
manipulation  and  interactive  applications,  is 
evaluated  with  respect  to  program  volume  and 
number  of  interpreted  instruction  bits.  The  eva- 
1  no i ion  takes  the  form  of  a  comparison  with  the 
HUP-  I  l  at  alt i  tec ture  using  av  test  data  a  set  of 
compline,  realistic,  programs  from  a  well-known 
source,  the  result  shows  the  superiority  of  the 
hit'll -l ova |  architecture. 


It  turns  out  that  LAX2  uses  significantly  fewer 
bits  for  inatructions ,  both  statically  and 
dynamically.  Thus,  the  present  study  gives  yet 
another  example  of  the  superiority  of  high-level 
architecture,  designed  from  language  and  appli¬ 
cation  considerations,  over  conventional  archi¬ 
tecture.  After  a  short  description  of  the  high- 
level  architecture  the  evaluation  method  and 
results  are  presented.  The  concluding  sections 
compare  the  present  work  with  earlier  evaluation 
studies  and  discuss  the  significance  of  the 
results. 


Introduction 

I  2 

I  .AX  2  '  is  a  high-level  architecture  designed 
lu  he  efficient  for  string  manipulation  and  inter¬ 
active  applications.  It  has  type-marked  values, 
dyunmii.  storage  allocation,  and  powerful  instruc¬ 
tions  lor  string  manipulation.  The  language  of  the 
machine  is  specified  in  two  levels,  a  source  or 
Li'Xl  level  TljVX  and  an  executable  level  ELAX. 

There  are  no  GOTO' a  in  TLAX;  ail  jumps  are  gene- 
i sled  from  high-level  control  structures  by  the 
simple  TLAX  -  ELAX  compiler  which  is  a  fixed  part 
■  >f  the  machine.  Memory  is  splitted  into  a  number 
•j!  data  and  program  blocks;  relative  and  indirect 
addressing  is  used  with  out-of-bounds  checking  tu 
achieve  compact  code  and  high  reliability. 

The main design goals for LAX2 are low cost for
software production and good memory and execution
time economy for the intended class of applications.
The design has been heavily influenced by the concepts
of structured programming. The architecture
has been implemented as a partially microcoded
interpreter on a Varian V73 minicomputer.

The present paper reports on an evaluation of
the LAX2 architecture. The evaluation is only concerned
with memory and execution time economy,
leaving out completely aspects such as ease of programming
and debugging, software security, and ease
of compilation. Furthermore, the number of interpreted
instruction bits, rather than physical execution
time, is used as the dynamic measure. The
evaluation consists of a comparison of LAX2 with
PDP-11, using a set of programs taken from the
well-known book Software Tools by Kernighan and
Plauger (3).

The  work  was  done  while  the  author  was 


Short  description  of  the  high-level  architecture 

LAX2 is a tagged architecture (4). Its design presupposes
a basic word format of 16 bits. Currently
the machine recognizes types of values according
to Figure 1.

Simple types:     nil, boolean, character, index
                  (integer in the range 0-16383)

Composite types:  string (of characters)
                  node (heterogeneous array)
                  decimal (decimally represented integer)
                  prog (executable procedure)
                  coprog (coroutine activation)
                  channel (for input or output)
                  (real, realarray planned, not yet implemented)


Figure 1. LAX2 data types


A value of simple type is represented by one
16 bit word with its leftmost bit cleared. A composite
value is represented by a 16 bit word, the
head, whose leftmost bit is set, pointing to a
memory block, the body, containing a type-and-length
descriptor and the value proper.
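
The distinction between simple values and heads can be illustrated with a minimal Python sketch. It relies only on the leftmost-bit rule stated above; the function names and the treatment of the remaining 15 bits as an opaque payload are assumptions made for illustration, not the actual LAX2 encoding.

```python
WORD_BITS = 16
WORD_MASK = (1 << WORD_BITS) - 1      # keep values within one 16 bit word
TAG_MASK = 1 << (WORD_BITS - 1)       # leftmost bit of the word


def is_composite(word: int) -> bool:
    """True if the word is the head of a composite value (leftmost bit set)."""
    return bool(word & WORD_MASK & TAG_MASK)


def is_simple(word: int) -> bool:
    """True if the word holds a simple value (leftmost bit cleared)."""
    return not is_composite(word)


def payload(word: int) -> int:
    """The remaining 15 bits; how they are interpreted is hypothetical here."""
    return word & (TAG_MASK - 1)
```

For example, the largest index value 16383 fits in a word with the leftmost bit cleared, so `is_simple(16383)` holds.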

The memory area of a LAX2 process is divided
into a stack in which procedure activation records
are allocated, and a heap, where compactifying
garbage collection is performed when necessary
(Figure 2).

Figure 2. Memory area of LAX2 process


An executable procedure, i.e. a prog value, can
only be created by means of the LAX2 instruction
'compile', taking a string, the TLAX version of the
procedure, as main argument. The body of a prog
value is shown (with some simplification) in
Figure 3. Each rectangle represents a 16 bit word.

[Diagram: a row of 16 bit words, divided into
administrative overhead, own variables (not more
than 31), and ELAX code]

Figure 3. A prog value


The ELAX code can only access the own variables
and stack variables (locals and parameters) of the
current activation record. Figure 4 shows the structure
of an activation record on the stack. The
stack variables are also represented by one word
each, and their number may not exceed 32. In this
way addresses to commonly referenced quantities are
kept very short. More remote information is
reached through indirect addressing. A complete
user program consists of a network of prog and
data values linked by the own variables of the
progs.


[Diagram: activation record on the stack, containing
an address (within prog), a link to the underlying
record, the head of the activated prog, and the
stack variables (locals and parameters; one word
each, containing a simple value or the head of a
composite value)]


ELAX code consists of a sequence of 8 bit bytes.
The design is similar to that of EM-1 (Tanenbaum (5))
and is characterized by compactness and the possibility
of fast instruction decoding. Figure 5 shows
some simple statements in Algol-like notation and
their ELAX counterparts.


Statement    ELAX code                               No of bytes

A:=B+3       push B, push 3, add, locate A, store         5
A:=A-B       push B, locate A, minus                      3
A:=A+1       locate A, incr                               2
A:=0         locate A, clear                              2


Figure 5. Simple ELAX examples
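
The byte counts of Figure 5 can be rechecked with a small Python sketch. It assumes that every producer, locator, effector and computor occupies one byte and applies the index constant lengths given in Appendix A (one byte for 0-10, two bytes for 11-255, three bytes for 256-16383). The tuple representation and the function names are ours and purely illustrative.

```python
def index_constant_length(value: int) -> int:
    """Bytes needed to push an index constant (lengths as listed in Appendix A)."""
    if not 0 <= value <= 16383:
        raise ValueError("index constants lie in the range 0-16383")
    if value <= 10:
        return 1
    if value <= 255:
        return 2
    return 3


def code_length(instructions) -> int:
    """Total bytes for a list of (mnemonic, operand) pairs.

    Assumption: everything except an integer constant takes one byte.
    """
    total = 0
    for mnemonic, operand in instructions:
        if mnemonic == "push" and isinstance(operand, int):
            total += index_constant_length(operand)
        else:
            total += 1
    return total


FIGURE_5 = {
    "A:=B+3": [("push", "B"), ("push", 3), ("add", None),
               ("locate", "A"), ("store", None)],
    "A:=A-B": [("push", "B"), ("locate", "A"), ("minus", None)],
    "A:=A+1": [("locate", "A"), ("incr", None)],
    "A:=0": [("locate", "A"), ("clear", None)],
}
```

Under these assumptions the four statements come out at 5, 3, 2 and 2 bytes, in agreement with the figure.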


LAX2 has a powerful set of string manipulation
instructions. A small example is given in Figure 6.
The guiding principle has been that although it
should be simple to dynamically create and throw
away strings, this feature should not be forced
upon the programmer, and that lexical and other
kinds of string analysis could be done with high
machine efficiency. The reader is referred to
(1,2) for further information on this and other
aspects of the LAX2 architecture. Appendix A
summarizes the ELAX instruction list.


Problem: The string S contains an identifier, an
operator symbol and an unsigned integer, possibly
separated by blanks. Assign the identifier
(a string) to A, the operator (a character)
to OP, and the integer (an index) to B.

ELAX solution: locate V, clear,
push S, locate V, getident, locate A, store,
push S, locate V, getchar, locate OP, store,
push S, locate V, getindex, locate B, store.

Total number of bytes: 17


Figure 6. String analysis example
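
The effect of the instruction sequence in Figure 6 can be imitated in Python as a rough sketch. The names follow the get effectors 'getident', 'getchar' and 'getindex' listed in Appendix A, but their exact semantics are defined in ref. (1) and are not reproduced here; the skipping of leading blanks and the definition of an identifier as a run of letters and digits are assumptions. The index v plays the role of the located variable V, which each effector advances as a side effect.

```python
def skip_blanks(s: str, v: int) -> int:
    """Advance v past blanks (assumed behaviour of the get effectors)."""
    while v < len(s) and s[v] == " ":
        v += 1
    return v


def getident(s: str, v: int):
    """Return (identifier, new v); an identifier is assumed to be a run of letters/digits."""
    v = skip_blanks(s, v)
    start = v
    while v < len(s) and s[v].isalnum():
        v += 1
    return s[start:v], v


def getchar(s: str, v: int):
    """Return (character, new v)."""
    v = skip_blanks(s, v)
    return s[v], v + 1


def getindex(s: str, v: int):
    """Return (unsigned integer, new v)."""
    v = skip_blanks(s, v)
    start = v
    while v < len(s) and s[v].isdigit():
        v += 1
    return int(s[start:v]), v


def analyse(s: str):
    v = 0                      # locate V, clear
    a, v = getident(s, v)      # push S, locate V, getident, locate A, store
    op, v = getchar(s, v)      # push S, locate V, getchar, locate OP, store
    b, v = getindex(s, v)      # push S, locate V, getindex, locate B, store
    return a, op, b
```

The point of the sketch is the shared cursor: each step continues where the previous one stopped, which is what lets the ELAX version get by with 17 bytes.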


Figure 4. Activation record


Dynamic type checking and the static checking
performed by the 'compile' instruction catch a
great number of possible programming errors.
Another feature promoting the efficient production
of reliable software is the absence of jump instructions
in the TLAX representation. All jumps are
generated during compilation from high-level control
structures. In many other respects TLAX offers
a rather primitive notation which, together with
the high level of ELAX, makes the compilation process
simple.


Method of evaluation

The book Software Tools is highly suitable as a
source of benchmark programs for LAX2, since the
programs are complete, have been used in practice,
and are typical of the application area of the
machine. The programming language used in (3) is
Ratfor, a structured dialect of Fortran. The following
programs were selected for use in the investigation.

a. ENTAB ((3) pp 37,21,20).

Copies a text file, substituting each sequence of
spaces preceding a tab stop by a tab character.
Tab stops are located at each 8th position in the
line.

Program size: 46 lines of source code (not counting
comment and blank lines).


b. COMPRESS ((3) p 44).

Produces a compressed version of a file using run
length compression, i.e., a sequence of identical
characters is encoded by length and character
value.

Program size: 36 lines.


c. PUTDEC ((3) pp 61,62, plus a main routine).

Converts integers to ASCII format and places them
in specified fields.

Program size: 38 lines (excl. main routine).

d. QUICKSORT ((3) pp 115,110,111, plus a main
routine).

Sorts a sequence of text lines into lexicographical
order by means of the well-known "quicksort"
algorithm.

Program size: 66 lines (excl. main routine).

e. FIND ((3) pp 136-138).

Searches a file, outputting each line containing
a certain pattern given as input. The pattern is
essentially a regular expression.

Program size: 279 lines.

The set of programs is rather small but is
hoped to be representative of the text processing
application area.

Next, these programs were translated for the
two architectures LAX2 and PDP-11.

The translation for LAX2 was obtained as follows.
The programs were rewritten into the language HLAX,
a high-level (above TLAX) notation for LAX2. The
HLAX programs were compiled using a cross-compiler
on a DEC-10 computer. During the rewriting process
care was taken to stay close to the original programs.
As a consequence the programs run on LAX2
have, except for minor details, the same data and
program structures and use the same algorithms as
the original programs. This means that the features
of LAX2 have not been used to full advantage.
However, an additional version (FIND.OPT) of FIND,
optimized for LAX2, was written. The optimization
relies mainly on the observation that a majority
of search patterns consist of or start with a literal
string. Therefore it should pay to modify the internal
representation of patterns and use the substring
searching 'part' instruction of LAX2. Also,
a recursive version (QUICK.REC) was written in
addition to the non-recursive version from (3).

To translate the programs to PDP-11 code the
language C (6) was used. As before, the rewriting
was done to faithfully preserve the given algorithms
and structure. To obtain high quality
machine code all features of the C language promoting
this goal were used, including the possibility
of declaring quantities to reside in registers.
The programs were compiled using the optimizing
compiler available under UNIX. As a result
of these measures we believe that the machine code
is as efficient as that produced by a competent
assembly language programmer, with the possible
exception that the latter may in some cases feel
inclined to use a less general subroutine calling
sequence, to save time for the saving and restoring
of registers. In addition to the five programs from
(3), the recursive sort program QUICK.REC was also
produced for PDP-11.


The results

The volumes of the programs, excluding input/output
routines, were measured. The volume is defined
as the size of the executable form of the
program including statically allocated data. The
sorting programs operate on data in primary storage;
this space is not included, as its size depends on
the size of the input.

The  result  is  displayed  in  Table  1. 


Program          Result    Result    LAX2/PDP-11
                 PDP-11    LAX2      in %

ENTAB               204       189        93
COMPRESS            251       210        84
PUTDEC (1)           97        60        62
QUICKSORT (1)       233       132        57
QUICK.REC (1)       175        77        44
FIND               1282       776        61
FIND.OPT         (1282)       841        66
Total (2)          2242      1444        64

Notes:  1: excluding main program
        2: excluding FIND.OPT


Table 1. Program volumes


1  A  Wat-  unvtl  c  \ 


The  high  percentage  figures  for  the  first  two 
programs  are  explained  by  the  fact  that  they  use 
data  structures  whose  sizes  dominate  over  the  sizes 
of  the  programs  proper. 
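
The percentage column and the totals of Table 1 can be recomputed with a few lines of Python. The volumes are those of the table; the dictionary layout and the function names are ours, and rounding to the nearest whole percent is an assumption about how the column was produced.

```python
# Program volumes from Table 1 as (PDP-11, LAX2); FIND.OPT is left out,
# as in the "Total" row of the table.
VOLUMES = {
    "ENTAB": (204, 189),
    "COMPRESS": (251, 210),
    "PUTDEC": (97, 60),
    "QUICKSORT": (233, 132),
    "QUICK.REC": (175, 77),
    "FIND": (1282, 776),
}


def percent(lax2: float, pdp11: float) -> int:
    """LAX2 volume as a percentage of the PDP-11 volume, rounded."""
    return round(100 * lax2 / pdp11)


def totals(volumes):
    """Sum both columns over all programs."""
    pdp11 = sum(p for p, _ in volumes.values())
    lax2 = sum(l for _, l in volumes.values())
    return pdp11, lax2


if __name__ == "__main__":
    for name, (pdp11, lax2) in VOLUMES.items():
        print(f"{name:10s} {pdp11:5d} {lax2:5d} {percent(lax2, pdp11):3d}")
    total_pdp11, total_lax2 = totals(VOLUMES)
    print(f"{'Total':10s} {total_pdp11:5d} {total_lax2:5d} "
          f"{percent(total_lax2, total_pdp11):3d}")
```

The overall ratio of 64% corresponds to the factor of about 1.5 in favour of LAX2 quoted in the Discussion.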

In  addition  to  these  static  results,  dynamic 
measurements  were  derived.  The  programs  were  exe¬ 
cuted  on  the  two  machines  and  the  number  of  inter¬ 
preted  instruction  bits  was  recorded.  These  counts 
exclude  all  input/output  handling. 

The following text files were used for input
during the dynamic measurements (Table 2).

File     Content                    No of ASCII    No of lines
                                    symbols

TEXT0    extract from report            580             13
TEXT1    extract from report           4752             98
TEXT2    extract from report           3806             99
TEXT3    source code, C language       1038             51
TEXT4    mail address list             5697            100
TEXT5    mail address list             1139             20

Table 2. Input data

The ENTAB and COMPRESS programs were run using
files TEXT1 - TEXT4 as input. PUTDEC uses no input
file; instead, the main routine makes 36 calls on
the conversion procedure.

The sorting programs were used to sort the lines
of TEXT1, TEXT3, and TEXT4, and also an already
sorted version TEXT4S of TEXT4, resulting in worst-case
performance.

Finally, the FIND programs were run using a collection
of 19 search patterns and the input files
TEXT0, TEXT3, and TEXT5. Three groups of measurements
were performed. Group 1 uses simple search
patterns consisting of single literal strings.
Group 3 uses complicated search patterns, and group
2 falls in between groups 1 and 3.

As  in  the  selection  of  the  test  programs  them¬ 
selves,  the  aim  in  the  selection  of  test  data  was 
to  achieve  realistic  and  typical  conditions  with  a 
reasonable  amount  of  effort. 

Table 3 summarizes the result of the dynamic
measurements. For FIND, measurements were taken separately
on the pattern building part (PATTERN) and
the pattern matching part (MATCH).

The superiority of the high-level architecture
is evident. Summing all measurements (omitting the
non-recursive QUICKSORTs), we get the overall
figure 28% for the ratio of LAX2 to PDP-11. However,
the variation across the programs is high, and the
result depends on the test data used.

The case of the optimized MATCH shows highly
favourable values for LAX2, especially for group 1.
The main explanation is that the search patterns of
group 1 consist of single literal strings, allowing
the search to be performed by the substring searching
'part' instruction. Likewise, the patterns of
group 2 consist of literal strings appended by
other constructs, so part of the search can be
speeded up as in the case of group 1. The sorting
programs also perform better than average on LAX2,
mainly due to the use of string comparison instructions
built into LAX2.


Program            Data           Result    Result    LAX2/PDP-11
                                  PDP-11    LAX2      in %

ENTAB              TEXT1-4          7105      4357        61
COMPRESS           TEXT1-4          9599      7308        76
PUTDEC             -                 130      46.3        36
QUICKSORT          TEXT1,3,4        2942       421        14
QUICKSORT          TEXT4S           6000       558         9
QUICK.REC          TEXT1,3,4        2925       354        12
QUICK.REC          TEXT4S           5925       516         9
FIND: PATTERN      all patterns      667       187        28
FIND: MATCH        group 1          8323      2768        33
                   group 2         16588      4778        29
                   group 3         23286      6166        26
                   group 1-3       48197     13712        28
FIND.OPT: PATTERN  all patterns    (667)       121        18
FIND.OPT: MATCH    group 1        (8323)      68.8         1
                   group 2       (16588)      1102         7
                   group 3       (23286)      6630        28
                   group 1-3     (48197)      7801        16

Table 3. No of interpreted instruction bits
(unit: 1000 bits)

The least favourable case for LAX2 is the
COMPRESS program. A closer look shows that this is
the program with the lowest frequency of procedure
calls. Procedure calling is more efficient in the
high-level machine than in PDP-11, and the generality
of the call-return sequence produced by the C
compiler emphasizes the difference. Code optimization
across procedure boundaries can be expected to
improve the PDP-11 results in some cases. Such
optimization is however a complex task.


Discussion 

An objection to the results presented is that
the influence of data storage and accessing has
been neglected. Additional questions may be raised
concerning the relevance of the number of
interpreted instruction bits as an architectural
measure. These issues will now be discussed. In
addition, the present work will be related to similar
published investigations.

The volumes displayed in Table 1 do not include
storage allocated dynamically during execution. How
would this dynamic storage requirement affect the
comparison? A look at the test programs shows that
the effect is small. Only the sort programs can
allocate more than on the order of 10 words. The
sort programs use one more word per line of input
in LAX2 than in PDP-11, due to the use of type-and-length
descriptors. With the test data used this
amounts to a 6% increase in data storage. The stack
frames in LAX2 are smaller than those used by the
PDP-11 code. The influence of this difference is
small, however. The recursive sort programs grow a
stack whose depth is only log2(n) frames, where
n is the number of input lines.

The number of interpreted instruction bits has
been shown to be small for LAX2 (Table 3). It might,
however, be suspected that the number of memory
references during access to data is higher for LAX2
than for the conventional machine, since each composite
value is equipped with a one-word descriptor.
Unfortunately no mechanism was available for monitoring
this effect. Inspection shows that in the
case of the test programs the descriptor references
would add well below 5% to the execution time. The
actual figure is of course quite implementation
dependent; for instance, a cache memory would probably
almost eliminate the overhead.

The number of interpreted instruction bits (NIB)
is an architectural measure clearly related to execution
speed. Small NIB values mean that little
time is spent in fetching instructions; however,
the complexity of the decoding process must also be
considered. In the case of LAX2 vs PDP-11 the latter
factor seems to be of small importance.

Given a physical implementation of an architecture,
one would expect the execution times of programs
to be proportional to their NIB values.
However, the accuracy of this correspondence depends
on the homogeneity of the instruction set, i.e. the
degree to which the instructions all "do the same
amount of work". In particular, the effect of
vector instructions has to be taken into account.
Like many other high-level architectures LAX2 has
instructions operating on variable length data, in
particular strings. If such iterative or vector
instructions are used frequently and on large data
items, then clearly the number of interpreted instruction
bits will give a too optimistic view of
physical execution time.

To estimate this effect the use of vector instructions
in the test programs was investigated.

The programs ENTAB, COMPRESS, and PUTDEC make negligible
use of vector instructions. The sorting
programs compare text lines by means of the vector
instruction 'string compare'. With the test data
used the average number of iterations (character
comparison steps) performed per such instruction is
as low as 3. The relative frequency of the instruction
is 5%. Let us assume, rather arbitrarily,
that each iteration counts as three normal, i.e.
non-vector, instructions. Assume further that all
instructions are of the same length in bits - which
is close to being true. Then we arrive at a prolongation
factor of 1.4 due to the use of vector instructions.
That is, to get a more realistic measure
of expected execution time, add 40% to the results
in Table 3 in the case of the sort programs.

The non-optimized FIND program makes less frequent
use of vector instructions. The optimized
FIND.OPT: MATCH, however, uses the substring
searching instruction 'part'. For literal string
patterns (group 1) we find the relative frequency
of 'part' to be 2% and the average number of iterations
to be close to 50. This gives a prolongation
factor of close to 4.
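
The two prolongation factors follow from one simple weighting, sketched here in Python. If a fraction f of the executed instructions are vector instructions, each performing k iterations on average, and each iteration counts as w normal instructions, then the relative cost is (1 - f) + f*k*w = 1 + f*(k*w - 1). The function and parameter names are ours; the weight w = 3 and the equal-length assumption are those stated in the text, and the frequency for 'part' is read as 2%.

```python
def prolongation_factor(frequency: float, iterations: float,
                        weight: float = 3.0) -> float:
    """Relative execution cost when vector instructions are weighted.

    frequency  -- fraction of executed instructions that are vector instructions
    iterations -- average number of iterations per vector instruction
    weight     -- normal instructions charged per iteration (3 in the text)
    """
    return 1.0 + frequency * (iterations * weight - 1.0)


if __name__ == "__main__":
    # Sort programs: 'string compare', 5% frequency, 3 iterations on average.
    print(prolongation_factor(0.05, 3))
    # FIND.OPT: MATCH, group 1: 'part', 2% frequency, about 50 iterations.
    print(prolongation_factor(0.02, 50))
```

With these inputs the sketch gives 1.4 for the sort programs and about 3.98, i.e. close to 4, for the optimized pattern matcher.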

These findings correlate well with the results
of Table 3 but do not fully account for the high
superiority of LAX2 in the cases discussed. The
remaining cause seems to be that the PDP-11 versions
are more heavily burdened with subroutine linkage
than the LAX2 versions, where certain subroutines
have been replaced by vector instructions.

As mentioned in the Introduction the LAX2 machine
has been implemented as a partially microcoded
interpreter on the Varian V73 minicomputer. The
volume of the microcode is 180 64-bit words, and
the remainder of the interpreter consists of
approximately 7K 16-bit words of V73 machine code.
Thus the microcoded part is small. Execution times
on the two machines were measured for the test programs.
The execution time ratio of LAX2-V73 to
PDP-11/45 varies from 14 to 0.3 with 6-8 as typical
values. These figures are quite satisfactory, considering
the usual slowdown due to software interpretation.
The hardware characteristics of the two
minicomputers are roughly equal.

Finally, the present work is compared with
similar published investigations.

Wilner has evaluated volumes of Fortran and
Cobol programs on Burroughs B1700 using language-oriented
instruction sets, in comparison to IBM
System/360 (and Burroughs B3500). The results
show improvements by a factor 2 to 3, larger than
the factor of about 1.5 for LAX2 compared with
PDP-11. This is however not surprising; S/360 code
is less compact than PDP-11 code.

Wortman compared the Student PL Machine of
his own design with the S/360. A large number of
small student programs were used as test cases.
Several dynamic and static measures were evaluated.
The results show a twentyfold superiority for his
machine in number of instruction bits, both in the
static and dynamic sense. However, it should be
noted, first, that his S/360 programs were produced
by the standard PL/I(F) compiler, and secondly,
that all runtime checks built into his high level
architecture are also included in the S/360 versions.
These checks include the PL/I(F) conditions
'subscriptrange', 'overflow', and 'stringrange'.
This is in contrast with our investigation, where
such checks are indeed performed by the LAX2
machine but not by the PDP-11 program versions.

Nielsen compared a proposed high-level
language architecture for the SPL language, a
high-level language with special provisions for
expressing vector and matrix computations, with the
Honeywell HDC-701P aerospace computer. The high-level
architecture versions of a set of benchmark
routines were found to require 19% fewer program
bits than carefully coded assembly language versions.
A timing analysis showed that the high-level
architecture programs could be expected to require
14% less execution time.

Tafvelin and Wikström compared a proposed
high-level language architecture for the machine
oriented high-level language Mary with IBM S/360.
A set of seven programs was used, with a total
S/360 volume of 42000 bits. The main result is that
program size is reduced almost by a factor of 3.
This is partially attributable to a sophisticated
addressing scheme called "refined display" used in
their architecture. No dynamic results are given.

The work by Tanenbaum (5) has already been mentioned.
His EM-1 architecture shares several properties
with LAX2 but does not have the application
orientation of the latter. The performance evaluation
he reports is based on a small amount of data.
All performance figures concern static code size.
Apart from isolated statements and programming
constructs he treats only four small programs.
Their total size on the PDP-11 is 3776 bits and on
EM-1 47% of this figure.


Conclusion 

The reported work has given yet another example
of the superiority of high-level architecture,
designed from language and application considerations,
over conventional architecture. The evaluation
was partial - the only examined properties
were program volume and number of interpreted instruction
bits. These quantities were evaluated
using a set of complete, realistic programs from a
well-known source.

The following features contribute significantly
to the shown superiority of the high-level
architecture:

-  efficient  subroutine  support 

-  structured  memory,  short  addresses 

-  application  oriented  data  types  and  operations. 

As stated in the Introduction the goals for the
LAX2 design include low cost for software production.
The high-level architecture supports this
goal by:

-  eliminating concepts from low-level programming
such as registers, primitive addressing,
pointer arithmetic, and goto statements

-  easing the compilation process (the basic compiler
is available as a machine instruction)

-  providing extensive run-time protection.

We are convinced that these properties significantly
promote programmer productivity as well as
the reliability of the software produced. The continuing
rise of the ratio of software cost to hardware
cost emphasizes the importance of such "soft"
advantages of high-level architecture. Unfortunately
they are hard to quantify. To do so for LAX2
would require, in the first place, more practical
experience with the machine than is available today.
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Appendix  A 

ELAX Instruction Summary

Of the 256 available byte values, the ones in the
upper half are reserved for producers and locators
(byte values in hexadecimal):

80-9E: producers, stack variables
A0-BE: producers, own variables
C0-DE: locators, stack variables
E0-FE: locators, own variables.
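
A decoder for these byte ranges might look as follows in Python. This is only a sketch: the range limits are taken from the list above as printed (the last one reconstructed as E0-FE), while the idea that the variable number is simply the offset from the start of the range is an assumption, as are all names.

```python
# (first byte, last byte, instruction class, variable area)
RANGES = [
    (0x80, 0x9E, "producer", "stack"),
    (0xA0, 0xBE, "producer", "own"),
    (0xC0, 0xDE, "locator", "stack"),
    (0xE0, 0xFE, "locator", "own"),
]


def classify(byte: int):
    """Return (class, area, slot) for a producer or locator byte, else None.

    The slot number (offset within the range) is a hypothetical encoding.
    """
    for first, last, kind, area in RANGES:
        if first <= byte <= last:
            return kind, area, byte - first
    return None
```

Byte values in the lower half, which hold the remaining opcodes, are reported as `None` by this sketch.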

A producer pushes the value of a variable on the
stack. In the case of a composite value, only its
head is pushed.

A locator locates the place of a variable and initiates
a locator-sequence. The latter is composed as
described by the regular expression
locator pursuer* (catcher | effector)
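
The regular expression can be checked mechanically, for instance with the following Python sketch. Each instruction is represented only by its class name; the one-letter encoding and the function name are ours and purely illustrative.

```python
import re

# One letter per instruction class, so the regular expression from the
# text can be applied to a sequence of classes.
_CLASS_LETTER = {"locator": "L", "pursuer": "P", "catcher": "C", "effector": "E"}
_LOCATOR_SEQUENCE = re.compile(r"LP*(C|E)")


def is_locator_sequence(classes) -> bool:
    """True if the list of class names forms one complete locator-sequence."""
    try:
        encoded = "".join(_CLASS_LETTER[c] for c in classes)
    except KeyError:
        return False  # a class that cannot occur inside a locator-sequence
    return _LOCATOR_SEQUENCE.fullmatch(encoded) is not None
```

For example, `locate A, store` is the class sequence locator, effector, which matches; a locator on its own does not.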

The opcodes used for pursuers, catchers, and effectors
are in the interval 00-2F, and the same opcodes
are also used for other instructions. This
is possible since the same instruction cannot
occur both within and outside a locator-sequence.

Pursuers enable remote accessing. The three main
pursuers are:

'Pcomp': The located value must be a string (, realarray)
or node v. An operand i of type index is
required (on the stack). The i'th component of
v becomes located.

'Pfirst': The located value must be a string (, realarray)
or node v. The first component of v becomes
located.

'Pown': The located value must be a prog p. An operand
i of type index is required. The i'th own
variable of p becomes located.

Catchers push a value on the stack. The three main
catchers are 'Ccomp', 'Cfirst', and 'Cown', cf. the
pursuers above. The value produced is that of a
component or an own variable, respectively.

Effectors are categorized as basic effectors,
string effectors, and special effectors. The basic
effectors are:

'clear': writes the index value 0.

'scratch': writes the value nil.

'store': writes a value popped from the stack.

'plus', 'minus' (only for index values): adds, resp.
subtracts, a value popped from the stack.

'incr', 'decr' (located value must be index): increments
by 1, resp. decrements by 1.

Before summarizing the string effectors some other
classes of instructions will be treated.

Constants are instructions pushing a value described
by the instruction itself on the stack.

Index constants: Values 0-10 are represented by the
byte values 00-0A. Values 11-255 are represented
by two-byte instructions. Values 256-16383 are
represented by three-byte instructions.

Character constants: Represented by two-byte instructions,
where the second byte contains the
character code.

The constant nil and the boolean constants true
and false have one-byte representations.

String constants: The empty string has a one-byte
representation. Other strings have a (n+2)-byte
representation, where the first byte is an opcode,
the second contains n, and the remaining bytes
the character codes of the string (1 ≤ n ≤ 255).

Decimal constants: See ref. (2).

(Real constants: Planned, see ref. (1).)

The remaining data types (see Fig. 1) have no constants.
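
The layout of string constants can be made concrete with a short Python sketch. Only the byte layout (opcode, length n, character codes) comes from the description above; the two opcode values are hypothetical placeholders, since the actual ELAX opcode assignments are not given here, and ASCII character codes are assumed.

```python
STRING_OPCODE = 0x20        # hypothetical opcode for a non-empty string constant
EMPTY_STRING_OPCODE = 0x21  # hypothetical opcode for the empty string


def encode_string_constant(text: str) -> bytes:
    """Encode a string constant as opcode, length byte, character codes."""
    data = text.encode("ascii")
    n = len(data)
    if n == 0:
        return bytes([EMPTY_STRING_OPCODE])      # one-byte representation
    if n > 255:
        raise ValueError("string constants hold at most 255 characters")
    return bytes([STRING_OPCODE, n]) + data      # (n+2)-byte representation
```

A three-character string thus occupies five bytes, and the empty string one byte.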

The instruction class computors contains instructions
taking a number of values from the stack
and producing a value on the stack. These instructions,
like the constants, are side-effect-free.
Subclasses of computors include binary operators,
unary operators, binary predicates, unary predicates,
converters, and creators. All computors have
a one-byte representation.

The  binary  operators  are  '  +  ',  '/',  and 

'modulo'.  They  are  defined  for  boolean,  index, 
decimal  (and  real)  operands. 

The unary operators are 'negate', defined for boolean,
decimal (and real) operands, and 'abs', defined
for decimal (and real) operands. ('truncate'
and 'round' are planned for reals.)

The binary predicates are 'same', 'diff' for comparison
of heads of composite values, and six relational
predicates, defined for operands of types
boolean, index, decimal (, real), and character and
string. - Here, as with most other instructions,
a character is regarded as a string of length one.

The unary predicates are 'letter' and 'digit' for
character operands, and 'bad', yielding true if and
only if its operand is nil, and 'good' - the negation
of 'bad'.

Converters convert from one data type to another.
In essence, direct conversion is possible between
character and index, between index and decimal
(, between decimal and real, and between index and
real). The converter 'length' produces the length
of a string, node, decimal, prog (or realarray).

The  creators  create  a  new  composite  value  (head  on 
stack,  body  on  heap).  They  are  'create',  to  create 
a  string  or  node  (or  realarray)  of  specified  length, 
'copy',  to  produce  a  copy  of  a  composite  value, 
'substring',  to  produce  a  substring  from  specified 
positions  in  a  string,  and  'cat',  to  produce  the 
concatenation  of  two  strings. 

The  string  effectors  have  the  following  in  common: 

-  The located value must be an index v.

-  At least one operand, a string s1...sn, is
required.

-  v must be less than n.

The effector treats the string segment s(v+1)...sn and
will normally increase the value of v as a side
effect. The aim has been to enable convenient
and efficient sequential processing of strings.
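
The shared behaviour of the string effectors can be pictured with a
small Python sketch. It is an illustration only: the function name
pass_letters, the 0-based indexing, and the exact effect (advance past
consecutive letters) are assumptions made here, since the individual
instructions are specified in ref. (1) and not in this paper.

```python
def pass_letters(s: str, v: int) -> int:
    """Hypothetical pass effector: skip letters in s beyond position v.

    v counts the characters already processed (0 <= v < len(s)), so the
    segment treated is s[v:], i.e. s(v+1)...sn in 1-based notation.
    Returns the increased value of v (the "side effect" on the index).
    """
    if not 0 <= v < len(s):
        raise ValueError("v must be less than the string length n")
    while v < len(s) and s[v].isalpha():
        v += 1
    return v
```

Calling pass_letters("abc123", 0) moves the index past the three
letters, so a following effector would start at the first digit.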

The string effectors are categorized as predicate
effectors, pass effectors, locate effectors,
get effectors, and put effectors. Descriptions of
the individual instructions can be found in ref.
(1). Here we can only offer an enumeration of them;
hopefully their names give some hints of their
meanings.

Predicate  effectors: 

'prefix',  'part',  'subequ' 

Pass  effectors: 

'pass', 'paslet', 'pasdig', 'pasletdig'

Locate effectors:

'locate', 'loclet', 'locdig', 'locletdig'

Get  effectors  (get  value  from  string): 

'getindex', 'getchar', 'getident', 'getdec',
('getreal',) 'getstring'

Put  effectors  (put  value  into  string): 

'putnext', 'putpart'

The next instruction class of interest is the jumps.
All  jumps  are  generated  from  high-level  control 
structures  during  the  TLAX-ELAX  compilation.  These 
include,  in  short: 

if  -  then  -  else  :  generates  forward  jumps 
case  :  generates  jump  table,  an  indexed  jump, 
and  forward  jumps 

do - od : generates a backward jump
exits  from  do-od :  generate  forward  jumps. 

In addition, the control structure suggested by
Zahn (C T Zahn, A control statement for natural
top-down structured programming, Programming Symp.
Proc. 1974 (Ed: B Robinet), Springer, 170-180)
is implemented in LAX2.

All jumps are within progs and relative; distances
are coded in one or two bytes. In total 22 opcodes
are allocated to jumps.

Additional  instructions  controlling  the  flow  of 
computation  are: 

'exec', 'return': for ordinary procedure (prog)
activation,

'init', 'attach', 'detach', 'resume', 'call': used
in connection with coroutines (coprogs),

'exit': for abandoning the current computation
and reinitialization of the LAX2 process.

LAX2 supports frequency measurements during
execution. So-called counters can be placed at
arbitrary points in programs; they are (if enabled)
automatically  incremented  each  time  they  are  passed 
during  execution.  Instructions  exist  for  operating 
the  counters. 

Fixprograms are protected programs created at the
initialization of a LAX2 process. Some of them are
automatically activated by different runtime error
events. There are also instructions for activating
fixprograms from other programs.

The 'compile' instruction, invoking the TLAX-ELAX
compiler, is implemented partially as hidden
LAX2 programs. There exist special ELAX
instructions only available to these programs,
cf. ref. (2).

A set of input/output instructions is described
in ref. (2).

Abstract

The complexity, in space and in time, of
directly interpreting serial, block structured,
high level languages is examined. On the basis
of this study, it is apparent why it is
undesirable to directly interpret high level languages.
A systematic procedure is developed for the
design of well-matched intermediate languages
for supporting high level languages.


With the steadily increasing emphasis upon
the construction of structured, reliable and
maintainable software, the trend is toward the
use of suitable high level languages (HLLs) in
preference to machine or assembly level
languages. The computer architect, thus, is faced
with the task of designing a conducive
environment for the execution of HLL programs. This
is a shift, in perspective at least, away
from the traditional role of the computer
architect; no longer is it appropriate to
approach the design task at the machine language
level.

One viewpoint advocates the direct
interpretation of the HLL program by an interpreter
implemented in either hardware, software or
firmware, e.g., [1,2,3]. The problems associated
with such direct interpretation have been
sketched in previous work [4,5] and will be
elaborated upon in this paper to demonstrate the
general undesirability of this approach. Thus,
it will be shown that most HLLs are not directly
interpretable by the space-time criteria that
are developed subsequently.

The alternative is to translate the HLL
program into an intermediate representation that
is directly interpretable. Such an intermediate
language is termed a directly interpretable
language (DIL) [5]. Currently, the DIL most
frequently used is the machine language of an
available computer. Unfortunately, all too
often, the machine language has not been designed
with the given HLL in mind, leading to significant
inefficiencies in time and space. It is of
interest, therefore, to understand and formalize
the design of a DIL that is well matched to a
given HLL and the relationship between the two.
Such a DIL could then either constitute the
instruction set architecture of a machine
dedicated to that HLL, or could be interpreted
by a universal host machine (UHM), i.e., a
machine which can interpret any DIL with equal
and relatively little difficulty. This paper
presents some preliminary results relating to
the properties of HLLs that disqualify them
from being DILs, the relationship between
well-matched HLLs and DILs and the process of
designing a DIL for a given HLL. Identifying
the essential characteristics of the universe
of DILs clearly is valuable in determining the
architecture of universal host machines.

The primary motivation behind the search
for an ideal DIL is the desire to optimize the
space-time requirements of the interpretation
process. A secondary goal is to facilitate
the compilation process. Some interesting
space-time measures and analyses of "ideal"
intermediate languages have been developed by
Hoevel and Flynn [6]. In this paper an attempt
is made to approach the design of DILs in a
systematic, top-down fashion with no assumptions
as to what the end-product should look like.
Instead, it is dictated by a systematic
methodology that accepts as input a description of
the HLL and is guided by current technological
limitations.

The DIL design will be effected in this
paper by considering the issues and problems
involved in directly interpreting a HLL. By
removing these problems via a systematic
transformation process, the target DIL will be
derived. Although no specific host hardware
descriptions are considered during the design,
such a DIL should (by the definition of a DIL
[5]) be one for which it is technologically
feasible to build a hardwired interpreter. In
other words, it should be possible to view the
target DIL as a machine language for a
hypothetical computer with certain basic, practically
feasible data and control structures. Such
specific implementation considerations will be
discussed in one of the later sections.

2. A Model of Interpretation

In this section, we shall present a
conceptual model of the process of (direct)
interpretation of a serial HLL. Some of the main features
of the interpretive process will then be
illustrated in terms of this model and a specific
example high level language. Figure 1 presents
the syntax and semantics for some of the
productions of our example HLL. The syntax is specified
in a context free BNF metanotation; the semantics
corresponding to each production are specified
in a semi-formal manner. If not originally so,
the source context free grammar (CFG)
specification is assumed to have been converted to an
equivalent ε-free form. The algorithmic methods
of achieving such a conversion are well known [7]
and are not discussed here. The HLL program of
Figure 2 will be used as a working example.

Our conceptual model of interpretation
draws heavily upon the concepts in Johnston's
Contour Model [8] and Knuth's approach to
specifying the semantics of programming languages
[9]. It consists of four concurrent, interacting
processes:

1. Lexical Analyzer: This process is a string
to string transducer which converts the input
alphanumeric string into an output string of
tokens corresponding to lexemes. The function,
operation and complexity of this process are
relatively well understood and will not be
considered further in this paper.

2. Syntactic Analyzer: This phase of
interpretation (also known as parsing or recognition)
is in essence a string to tree transduction
process, where the string of tokens emitted by
the lexical analyzer is converted into a (parse)
tree using some convenient parsing strategy.

3. Static Semantic Analyzer: This process is
the one which operates on the tree being built
by the syntax analyzer by associating with each
node the relevant semantic information needed
to be able to perform the actions called for by
the program semantics. Any propagation of
attributes (up and down the tree) required to be
performed in order to fully specify the
attributes (and hence, the semantic actions) of each
node has to be carried out by this analyzer [8].
Nodes or subtrees deemed useless (i.e. after all
relevant attributes have been made use of or
transmitted to the root of the subtree) are
discarded as the analysis proceeds. This process
does not, itself, perform the actions indicated
by the program. It merely gathers the information
needed and sets up the next process. All
data-independent actions that can be performed by
analyzing the source program alone are in the
realm of the static semantic analyzer.


4. Dynamic Semantic Analyzer: This process
actually performs the semantics of the program,
by executing the semantic actions associated
with each node of the tree. Subtrees are
discarded as soon as the relevant semantic actions
have been executed and the attributes are no
longer needed by the static semantic analyzer.
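
The division of labor among the four processes can be pictured as a
chain of lazy Python generators in which the last stage drives the
others and each stage produces only what the next one asks for. This is
a minimal sketch under stated assumptions, not the model of Figure 1:
the toy source form "name = number ;" and all function names are
invented here for illustration.

```python
import re

def lexical_analyzer(text):
    """String-to-string transducer: characters in, tokens out (lazily)."""
    for match in re.finditer(r"\d+|[A-Za-z_]\w*|\S", text):
        yield match.group()

def syntactic_analyzer(tokens):
    """String-to-tree transducer for the toy form:  name = number ;"""
    for name in tokens:
        if next(tokens, None) != "=":
            raise SyntaxError("expected '='")
        value = next(tokens, None)
        if value is None or not value.isdigit():
            raise SyntaxError("expected a number")
        if next(tokens, None) != ";":
            raise SyntaxError("expected ';'")
        yield ("assign", name, value)

def static_semantic_analyzer(trees):
    """Attach data-independent attributes (here: the constant's value)."""
    for kind, name, value in trees:
        yield (kind, name, int(value))

def dynamic_semantic_analyzer(nodes):
    """Controlling process: pulls one node at a time and executes it."""
    store = {}
    for _kind, name, value in nodes:
        store[name] = value  # the node is dropped once its action has run
    return store

def interpret(text):
    tokens = lexical_analyzer(text)
    trees = syntactic_analyzer(tokens)
    nodes = static_semantic_analyzer(trees)
    return dynamic_semantic_analyzer(nodes)
```

Because every stage is a generator, no stage runs ahead of its consumer,
and at most one tree node exists at a time in this toy setting.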

It is important to note that the four
processes listed above run in a mutually interlocked
manner such that each process gets ahead of the
next one in sequence only to the extent necessary
for the latter to operate. The controlling
process is the dynamic semantic analyzer whose
actions are specified by the statements following
the label "Dynamic actions" in the definition of
the semantics in Figure 1. In performing its
function, it must make use of certain attributes,
termed S-derived, which are evaluated by the
static semantic analyzer. S-derived attributes
are defined to be those attributes which can be
derived by an analysis of the program text
(i.e., input data independent). The derivation
of these attributes is specified in Figure 1 in
an assertive rather than an imperative manner,
i.e., their relationship to other attributes is
specified instead of a series of statements the
execution of which would assign to them their
correct value. The manner in which they are
derived is deliberately left unspecified. It is
implicitly understood that the dynamic semantic
analyzer forces the static semantic analyzer to
proceed just far enough that the needed S-derived
attributes have been evaluated. The syntax
analyzer has a pointer, SYN, into the string of
lexemes emitted by the lexical analyzer, that
points one lexeme beyond the (minimum) amount of
the string that the syntax analyzer must have
consumed so as to set up enough of the syntax
tree for the static semantic analyzer to perform
its function. The syntax tree is necessary
since the S-derived attributes are necessarily
defined in the context of this tree. Generally,
the lexical analyzer's pointer, LEX, into the
alphanumeric string will correspond exactly to
SYN. Assume the dynamic semantic analyzer is
executing the semantics of the node labelled
⟨Block⟩1 in Figure 2. This requires knowledge
of the number of declarations in the outermost
block. To determine this, the static semantic
analyzer requires that all the declarations in
the outermost block be parsed. Consequently, LEX
will be at the "x" immediately following
"integer xi".

The manipulation of SYN and LEX is, by and
large, implicit. In the case of loops,
conditionals, procedure calls and returns, the
dynamic actions explicitly alter LEX (and
consequently SYN) by a statement of the form
"Parse (u,v)" or "Parse and Process (u,v)" where
u identifies a character in the program text by
its memory address and v is a non-terminal which
serves as the goal for the parser. In the case
of procedure calls, the current value of LEX is
saved explicitly.

In Figure 1, attributes labelled D-derived
are evaluated by the dynamic semantic analyzer.
An S-derived attribute is termed COPIED if it is
merely the copy of an attribute elsewhere. An
attribute is INHERENT if its value is an inherent
property of that node. In addition, the types
of the attributes (INTEGER, REAL, POINTER, etc.)
are specified. Figure 1 clearly demonstrates
the complexity of the procedure call and return
(see productions 10 and 24). Note also that
production 21 requires that the text to be skipped
be parsed, even though it will not be executed,
just to determine where the ⟨Stmt⟩ or ⟨Simpstmt⟩
ends.

3. Space and Time Requirements
for Interpretation

The model of interpretation developed in
the previous section may be used to obtain a
qualitative understanding of the time and space
involved in the direct interpretation of HLLs.
Although, in practice, the tree representation
would probably be discarded in favor of a more
compact representation such as a stack, the space
occupied by the tree is related by a factor of
proportionality and, so, is a good indicator of
the actual space requirements. The advantage
of the tree representation lies in its conceptual
simplicity which is uncluttered by extraneous
implementation issues.

The space requirements are five-fold:
(1) the space occupied by the program being
interpreted; (2) that occupied by the interpreter;
(3) that required to hold the portion of the
syntax tree that is currently in existence; (4) the
space needed to store the attributes associated
with the tree nodes; (5) the space occupied by
the parse stack which contains terminals and
non-terminals that have been scanned by the syntax
analyzer but are yet to be reduced. (This is
needed when a bottom-up parsing scheme is used.)

The total computation time for the interpreter is
the sum of the computation times for the
individual processes.

An obvious way of reducing the size of the
program being interpreted is to replace the
alphanumeric string representation of lexemes by
more efficiently encoded bit-strings during a
pre-processing step. As a result, the lexical
analysis process would be eliminated from the
interpreter thereby reducing the interpretation
time. On the other hand, no longer would one be
interpreting the original HLL directly; instead,
a closely related language would be the object
of interpretation. In this manner, by
identifying the problems associated with the direct
interpretation of the original HLL and by
modifying the HLL only to the extent absolutely
necessary to remove these problems, one obtains a
language that is as closely related to the
original as possible while possessing the property
of being directly interpretable. Pragmatically,
a language will be considered to be directly
interpretable if, in the context of current
technology and cost-functions, it is feasible and
desirable to directly interpret the language in
comparison to alternative strategies. Thus, the
demarcation between languages which are and are
not directly interpretable is vague at best and
may be expected to change with time.

The space occupied by the interpreter is
related to its complexity. The dynamic semantic
analyzer is central to the interpreter and can,
at best, be made more efficient but cannot be
eliminated. As shall be shown subsequently, the
static semantic analyzer and the syntactic
analyzer can be eliminated by suitably modifying
the language.

The space requirements for the syntax tree
are best minimized by reducing the amount of the
tree that is in existence at any one time. This
corresponds to those nodes that have not yet
been processed and discarded by the dynamic
semantic analyzer. Whereas the objective must
be to prevent the syntax analyzer from getting
far ahead of the dynamic semantic analyzer
(to minimize the size of the tree present), there
are factors that will prevent the realization of
this goal; there are occasions when the dynamic
semantic analyzer, to perform its function,
requires information (attributes) that the static
semantic analyzer can provide only by looking
ahead in the tree, which in turn requires that
the syntax analyzer have proceeded far enough
ahead. The language must be altered to remove
such situations. These modifications, by
reducing the size of the tree, also reduce the
total number of attributes that must be stored
and, consequently, the amount of space needed for
this purpose.

The fifth space requirement depends upon the
parsing strategy that is selected (or imposed
by the grammar specification). The two broad
classes of parsing techniques are the top-down
and the bottom-up methods. Most parsing
strategies can be viewed as either one or the other
or a hybrid. With the top-down technique, the
production to be used is known when the syntax
analyzer's pointer into the string corresponds to
the left most terminal of that production (with
an optional look ahead of k). The input tokens,
therefore, may be consumed and acted upon as
they are encountered since their syntactic
significance is defined when they are first
encountered. In contrast, bottom-up techniques
know which reduction is to be applied only when
the syntax analyzer's pointer is at the token
which corresponds to the right most terminal of
the corresponding production (once again, with
an optional look ahead of k). In general, there
will exist a number of terminals (and
non-terminals) whose syntactic significance has not
yet been established (since the corresponding
right handles have not yet been encountered), but
which have been already scanned by the syntax
analyzer. Space is needed to store these items,
generally in the form of a stack. From this
point of view, a grammar suited to top-down
parsing is indicated.

With respect to interpretation time, there
is little that can be done to minimize the time
required by the dynamic semantic analyzer beyond
eliminating inefficiencies since the algorithm
embedded in the program must be executed. The
amount of computation performed by the static
semantic analyzer is reduced if the type of
attribute propagation can be matched to the
parsing strategy. Inherited (synthesized) attributes
can be handled easily with a top-down (bottom-up)
strategy. However, since both types of
attributes are generally involved, the best approach
is to explicitly provide certain crucial
attributes in the string, thereby implying a further
modification to the language.

Before discussing ways of reducing the time
expended in syntax analysis, it is instructive
to catalog the various reasons for the existence
of syntax with a view to totally eliminating the
syntax analyzer if possible.

1. Reliability. The major function of syntax
at this point is to restrict the user to a
set of strings that are meaningful to the
language processor.

2. Readability.

3. To remove static semantic ambiguities. The
procedure for deriving attributes is defined
in the context of the syntax tree which must,
therefore, be derived.

4. To remove dynamic semantic ambiguities.
Often the dynamic semantics of certain
constructs are defined by the syntax tree,
e.g., precedence relationships binding
operands to operators.

5. To permit an efficient parsing strategy.

In the case of a HLL, all of these points
are important and the syntax cannot be ignored;
nor can the syntax analysis be eliminated. If
the emphasis is placed on the last issue, that
of an efficient parsing strategy to reduce the
interpretation time, then it may be necessary,
as we shall see, to sacrifice some readability.
We shall do so to obtain a "high-ish level" DIL.

On the other hand, if we are interested in
a related "low level" DIL, i.e., one which is
compiled into and then interpreted but never
directly programmed in, then only issues 3
through 5 are relevant. Readability is clearly
unimportant and reliability is guaranteed since
the compiler will not pass any illegal programs.
If we further perturb the language so that the
semantics are defined independently of the
syntax, then syntax analysis is rendered useless
and may be discarded altogether. The interpreter
may now recognize a degenerate grammar (one with
very few productions) which essentially permits
any string of terminals. The syntax analysis for
such a grammar consists merely of checking for
illegal terminals.

Both the high-ish level DIL and the low
level DIL are closely related to the original HLL
by virtue of the systematic transformations
that are listed in the next section. The former
DIL may be viewed as a substitute for the HLL
if a directly interpretable HLL is deemed
essential. The latter DIL is best viewed as a
well matched intermediate language for the HLL.
It is clear that a number of DILs may be defined
that are intermediate between these two DILs.

4. A Design Methodology for Directly
Interpretable Languages


In the context of the previous discussion,
the following sequence of modifications (on the
high level language) may be used to arrive at a
directly interpretable language:

(a) Distinct syntactic tokens or left handles
(represented by underscored integers in this
paper, e.g. 1, 3) are inserted to all production
right-hand sides (Figure 3). This makes the
grammar LL(1), thus simplifying the top-down
syntax analysis phase.
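
The effect of step (a) can be sketched in Python as follows. The
grammar below is a toy one invented for illustration (it is not the
grammar of Figure 3), and integers stand in for the underscored left
handles. Because every production begins with its own left handle, the
parser selects a production with a single table lookup and never
backtracks.

```python
# Hypothetical productions, keyed by their distinct left-handle token.
PRODUCTIONS = {
    1: ("Block", ["begin", "Stmt", "end"]),
    2: ("Stmt", ["id", ":=", "Exp"]),
    3: ("Exp", ["num"]),
    4: ("Exp", ["id"]),
}
NONTERMINALS = {"Block", "Stmt", "Exp"}

def parse(tokens, goal, pos=0):
    """Return (tree, next_pos) for nonterminal `goal` at tokens[pos]."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    handle = tokens[pos]
    production = PRODUCTIONS.get(handle)
    if production is None or production[0] != goal:
        raise SyntaxError(f"no {goal} production for left handle {handle!r}")
    pos += 1
    children = []
    for symbol in production[1]:
        if symbol in NONTERMINALS:
            child, pos = parse(tokens, symbol, pos)
        else:
            if pos >= len(tokens) or tokens[pos] != symbol:
                raise SyntaxError(f"expected terminal {symbol!r}")
            child = symbol
            pos += 1
        children.append(child)
    return (goal, handle, children), pos
```

For the token string [1, "begin", 2, "id", ":=", 3, "num", "end"], the
call parse(tokens, "Block") consumes all eight tokens in one
left-to-right pass.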


In practice, all productions would not have
distinct left handles; only the productions
corresponding to the same non-terminal need have
distinct left handles. This would drastically
reduce the number of syntactic tokens needed to
six. However, in the interests of clarity, we
shall retain this redundancy. No changes to the
semantics are called for as a result of this step.

(b) Each production right hand side is
re-ordered, in accordance with the sequence of
semantic specifications attached to that
production, i.e., the terminals and non-terminals are
placed in the same order in which they are used.
Figure 4 shows the productions affected by this
step.

(c) Semantic tokens (integers with overscores:
~, ^ or v) are introduced at selected points in
the productions to indicate the need for semantic
actions. Of these, the first type of tokens
(e.g. 15) calls for semantic action(s) which can
be performed without reference to a propagated
attribute. Such tokens can thus be scanned and
immediately acted upon. The second type (e.g. 3)
references an attribute that is propagated from
a node which is to the right in the tree
(right-to-left attribute propagation), while the third
type (e.g. 6) uses an attribute obtained from the
left (left-to-right attribute propagation).
Figure 5 illustrates the effect of applying this
step to the selected productions.

(d) The second and third types of semantic
tokens (marked ^ and v) are replaced, in each
case, by a token of the first kind (marked ~)
followed by an explicit attribute (e.g. (numb)),
thereby eliminating the need to propagate
attributes at interpretation time. In the last two
steps a number of redundant semantic tokens have
been defined to enhance clarity. In practice,
this redundancy would be eliminated.

(e) All the original terminal symbols (e.g.
begin, end etc.) are deleted from the language
and the grammar. These symbols, it may be noted,
are totally redundant at this point, both
syntactically and semantically.


The final form of the DIL grammar at the end
of steps (a) through (e) is shown in Figure 6.


It is to be noted, in summary, that our
newly derived language (DIL) has the following
desirable properties:

1) Top down LL(1) parsing (with no
backtrack) is possible. Thus syntax analysis
is simple.

2) Close tracking between the three
interpretation subprocesses is possible,
resulting in minimum tree storage
requirements and overall speedup in the
semantic analysis phase.

3) Due to the closely matched HLL and DIL
grammars, a simple syntax-directed
translation scheme (SDTS) [10] may be adopted
for the translation phase.

It is to be noted that minimizing the space
requirement for holding the DIL program has not
really been considered in listing the
modification steps. However, one might guess that the
price paid (in terms of increased program size)
for achieving the advantages listed above is
acceptable.

The language that we have just derived may
be used as a high level language in which
programming may be performed if the lexemes are
represented alphanumerically and the tokens are
represented by keywords. This will require the
reintroduction of the lexical analyzer. The
most unacceptable feature of this language lies
in having to explicitly specify the number of
lexemes that have to be branched over. The use
of labels, while making the language marginally
acceptable, would require the equivalent of a
one-and-a-half pass assembly phase. The
language would no longer be directly interpretable.

If we desire a language that is to be used
merely to be compiled into and then directly
interpreted, we can continue the transformation
process further. Since the need for attribute
propagation by the static semantic analyzer is
no longer present, syntax analysis at this
point is needed only for checking the syntactic
correctness of the program. If the DIL is not
to be used for direct programming, syntactic
checking is unnecessary, since any errors would
have been detected during the translation phase.
Adopting this point of view, we may proceed
to delete all tokens which are purely syntactic
(i.e., tokens that are only underscored) from
the DIL grammar of Figure 6. The result now
truly resembles an "assembly" language, in that
the program consists of a sequence of semantic
tokens, or "op codes". Figure 8 shows the
program with numerical tokens replaced by
alphabetic mnemonics. The simplest grammar that
will accept programs in this "assembly" language
is the trivial grammar shown in Figure 7, since
the absence of syntax checking implies that any
sequence of semantic tokens is acceptable to
the interpreter, even if semantically
meaningless. If the interpreter is based on this
grammar, the syntax analysis process becomes
degenerate. The grammar of Figure 6 (after
deleting purely syntactic tokens) is needed,
nevertheless, to permit the translation of the
HLL program into the "assembly" language in a
syntax-directed way.
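
An interpreter for such an "assembly" level DIL can be sketched in
Python as a loop over a flat sequence of semantic tokens. The
mnemonics below (PUSH, LOAD, STORE, ADD, SKIPZ) are hypothetical and
are not those of Figure 8; each is followed by its explicit attributes,
and SKIPZ carries the number of lexemes to be branched over. Syntax
analysis is reduced to rejecting illegal tokens.

```python
# Hypothetical mnemonics, each mapped to the number of explicit
# attributes that follow it in the program string.
OPCODES = {"PUSH": 1, "LOAD": 1, "STORE": 1, "ADD": 0, "SKIPZ": 1}

def run(program, slots=4):
    """Interpret a flat list of op codes and attributes; return the store."""
    store = [0] * slots
    stack = []
    pc = 0
    while pc < len(program):
        op = program[pc]
        if op not in OPCODES:  # the only "syntax check" that remains
            raise ValueError(f"illegal token {op!r} at position {pc}")
        args = program[pc + 1 : pc + 1 + OPCODES[op]]
        pc += 1 + OPCODES[op]
        if op == "PUSH":
            stack.append(args[0])
        elif op == "LOAD":
            stack.append(store[args[0]])
        elif op == "STORE":
            store[args[0]] = stack.pop()
        elif op == "ADD":
            right = stack.pop()
            stack.append(stack.pop() + right)
        elif op == "SKIPZ":
            if stack.pop() == 0:
                pc += args[0]  # branch over an explicit number of lexemes
    return store
```

The explicit skip count is what lets the interpreter jump without
parsing the skipped text, at the cost of the readability problem noted
above for the high-ish level form.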

In actual practice, some minimum amount of
syntax checking may be desirable even at the
"assembly" language level, in which case, the
grammar specification would lie intermediate
between the two "extremes" of Figure 6 (full
syntax checking capability) and Figure 7 (no
syntax checking).

5.  Technological  Constraints 
and  Implications 

Various assumptions regarding the available
hardware and software technology have been
implicit up to this point. These assumptions
will now be discussed. Firstly, it is assumed
that the best technique for the construction of a
parse tree is through the use of a pushdown
automaton. (Compiler theory offers no better
alternative). Hence, syntax analysis will
necessarily be time-consuming unless the grammar

is  u. :d. 

It is assumed that the large scale use of
associative memory will not be cost-effective or
acceptable. Hence, information must be represented
by data structures that support searching.
For instance, the association of an identifier
reference to the corresponding declaration
(to obtain attributes) would clearly be facilitated
by the use of associative memory. In the
absence of associative memory, this information
must be maintained in data structures (hash
tables, linear lists, etc.) which simplify the
search. Furthermore, since such searches are,
at best, relatively slow, it is preferable to
provide explicit attributes in the program which
convert the associative search to a well-defined
look-up procedure. In the previous example, the
identifier reference should be replaced by two
attributes consisting of the specification
(relative to the current contour) of the contour
containing the variable and the ordinal number of
the identifier declaration amongst the set of
declaration attributes attached to the corresponding
<block> node (i.e., an address couple).
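The look-up that an address couple makes possible can be sketched as follows. This is only an illustration under stated assumptions: the names `Contour`, `resolve`, and `assign` are hypothetical, the couple is taken in the (lexical level, ordinal number) form given in the footnote to Figure 6, and the list `display` stands in for whatever mechanism locates the contour at a given level.

```python
from dataclasses import dataclass, field


@dataclass
class Contour:
    """One activation of a block: storage for its declared variables."""
    level: int
    slots: list = field(default_factory=list)


def resolve(display, couple):
    """Fetch a variable by address couple; no name search is involved."""
    level, ordinal = couple
    return display[level].slots[ordinal]


def assign(display, couple, value):
    """Store into the variable named by an address couple."""
    level, ordinal = couple
    display[level].slots[ordinal] = value


# Hypothetical run-time state: one contour per lexical level (0, 1, 2).
display = [Contour(0, [0, 0]), Contour(1, [10, 0]), Contour(2, [1])]

# Equivalent of "(1,1) := (1,0) + (2,0)" with every identifier pre-resolved.
assign(display, (1, 1), resolve(display, (1, 0)) + resolve(display, (2, 0)))
```

Each access is two indexing steps, whatever the number of identifiers declared, which is the point of replacing the associative search by explicit attributes.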

Also, it is not evident how a tree structure
may be implemented in hardware whereas stacks are
readily implementable either in hardware or in
software. Thus, wherever possible, tree structures
must be replaced by stacks. The sub-tree
corresponding to <exp> can be supported by an
evaluation stack. If this is done, the semantics
associated with certain productions in the grammar
must be altered and be expressed in terms of
stack operations. If the block retention rules
of the language permit (as is the case in our
example language), the contour nodes may be
maintained on a contour stack and the associated
declaration attributes may be allocated space on an
allocation stack. As in the Burroughs B6500
[11], the three stacks may be combined (with a
slight attendant increase in complexity).
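As a rough illustration of expressing the semantics in terms of stack operations, the sketch below interprets a short token sequence with an evaluation stack. The mnemonics are modelled loosely on those of Figure 8, but the function `run`, the tuple encoding of tokens, and the dictionary used as variable storage are assumptions made for this example only.

```python
def run(program, memory):
    """Interpret (opcode, operand) tokens using an evaluation stack."""
    stack = []
    for op, arg in program:
        if op == "PUSHI":        # push a literal value
            stack.append(arg)
        elif op == "PUSHVAL":    # push the value found at an address couple
            stack.append(memory[arg])
        elif op == "ADD":        # replace the top two entries by their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "ASSIGN":     # pop the top entry into an address couple
            memory[arg] = stack.pop()
        else:
            raise ValueError(f"unknown token: {op}")
    return memory


# Hypothetical storage keyed by address couple (lexical level, ordinal).
memory = {(1, 1): 5, (2, 0): 3}

# Equivalent of "(1,1) := (1,1) + (2,0)" with no expression tree built.
program = [
    ("PUSHVAL", (1, 1)),
    ("PUSHVAL", (2, 0)),
    ("ADD", None),
    ("ASSIGN", (1, 1)),
]
run(program, memory)
```

No syntax analysis takes place: any token sequence is accepted, and a malformed one simply fails at run time, as the discussion of the trivial grammar of Figure 7 anticipates.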

6. Discussion

The undesirability, in space and time, of
directly interpreting most HLLs stems from the
need to do syntax and static semantic analyses.
Various factors contribute to this need and it
has been shown how they can be eliminated to
yield a directly interpretable language. The DIL
that is obtained is not unique; two DILs, a low-level
one and another higher level one, were
obtained in this paper by a systematic transformation
process. Other trade-offs, not discussed
in this paper, exist between the size of the DIL
program, the size of the syntax tree and the
interpreter's run time. Thus, a space of DILs exists
for each HLL, and the one selected must be
specified by further constraints and criteria.
Also, precise measures of space and time need
to be developed to place qualitative discussions
on a quantitative footing.

Most compilers have a code-optimization
phase which performs two functions: machine-independent
optimization and machine-dependent
optimization. The former consists of program
transformations which involve knowledge of the
DIL being compiled into. Such optimization is
generally self-defeating in a HLL interpreter
since the cost of repeated optimization outweighs
the benefits accrued. When designing a
DIL for a HLL, the presence of the optimization
phase in the compiler should not be ignored
since it can alter the structure of the syntax
tree into a directed acyclic graph (e.g., a
common sub-expression's tree may be a sub-tree
for a number of nodes). The stack, by itself,
may not be an adequate vehicle for implementing
such networks. Machine-dependent optimization is
present primarily to bridge the mismatch between
the semantics of the HLL and the machine language.
However, if the "machine" language is designed to
match the HLL, this form of optimization may
prove unnecessary.

The important issue of encoding strategies
for DIL programs has not been touched upon in
this paper and, so, program statistics for the
HLL have not formed an input to the DIL design
process. The encoding technique used can assume
various levels of complexity. To begin with,
the introduction of redundant syntactic and
semantic tokens should be avoided. Assuming that
the interpreter will run on a machine that provides
for accessing arbitrary length bit-strings
(essential for a UHM), the terminals of the DIL
should be assigned codes that contain just enough
bits to differentiate between the terminals that
could have appeared at that point. In this
respect, the grammar of Figure 6 is preferable
to that in Figure 7 since it reduces the inherent
ambiguity at each step. On the other hand,
syntactic tokens are now needed and may cause a
net increase in program size. Finally, a
frequency-based encoding scheme may be employed,
defined either on the linear string or on the
parse tree [12]. The latter scheme will probably
do better, but makes syntax analysis a necessity,
yet another space-time trade-off.
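The idea of giving each terminal "just enough bits" for its context can be made concrete with a small calculation. The sketch below is a simplification under explicit assumptions: the context is reduced to the preceding token rather than a full parser state, and the table `allowed_after` is an invented toy grammar, not the grammar of Figure 6 or Figure 7.

```python
def bits_needed(choices):
    """Smallest code width that distinguishes `choices` alternatives."""
    return (choices - 1).bit_length() if choices > 0 else 0


def fixed_width_bits(tokens, vocabulary):
    """Every token costs the same width, enough for the whole vocabulary."""
    return len(tokens) * bits_needed(len(vocabulary))


def context_bits(tokens, allowed_after, start="START"):
    """Each token costs only enough bits to pick among its legal successors."""
    total = 0
    previous = start
    for token in tokens:
        choices = allowed_after[previous]
        if token not in choices:
            raise ValueError(f"{token} cannot follow {previous}")
        total += bits_needed(len(choices))
        previous = token
    return total


# Hypothetical successor table for a toy token language.
operands = ["PUSHI", "PUSHVAL"]
allowed_after = {
    "START": ["BEGIN"],
    "BEGIN": operands,
    "PUSHI": operands + ["ADD", "ASSIGN"],
    "PUSHVAL": operands + ["ADD", "ASSIGN"],
    "ADD": operands + ["ADD", "ASSIGN"],
    "ASSIGN": operands + ["END"],
    "END": [],
}
vocabulary = ["BEGIN", "END", "PUSHI", "PUSHVAL", "ADD", "ASSIGN"]
tokens = ["BEGIN", "PUSHI", "PUSHVAL", "ADD", "ASSIGN", "END"]

fixed = fixed_width_bits(tokens, vocabulary)    # 6 tokens at 3 bits each
contextual = context_bits(tokens, allowed_after)
```

For this toy input the fixed-width encoding needs 18 bits and the context-dependent one 9, but decoding the latter requires tracking the context, which is the space-time trade-off noted above.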

The low-level DIL that was obtained is not
radical in nature and, in fact, looks quite similar
to a number of stack architectures. However,
the relationship between features of the DIL and
the HLL is now clearer. Also, issues such as
the instruction formats to be used, which generally
assume a central position in instruction
set design, fall out in a natural manner as a
result of encoding decisions and conform to,
rather than constrain, the other syntactic and
semantic requirements of the DIL.

In conclusion, we do not advocate the direct
interpretation of sophisticated high level languages
since there are far too many costly computations
involved that are best factored out and
performed just once during a compilation phase.
Instead, a well-matched directly interpretable
language should be designed along the lines
suggested in this paper. Thereby, space-time
savings will be achieved and the compilation
process will be facilitated.

References 

1. K. J. Thurber and J. W. Myrna, "System design
of a cellular APL computer," IEEE Trans.
Comp., C-19, 4, 1970, 291-303.

2. J. P. Anderson, "A computer for direct
execution of algorithmic languages," Proc.
EJCC, 1961, 184-193.

3. H. M. Bloom, "Conceptual design of a direct
high-level language processor," High-Level
Language Computer Architecture, Y. Chu (Ed.),
Academic Press, 1975, 187-242.

4. L. W. Hoevel, "'Ideal' directly executed
languages: an analytical argument for emulation,"
IEEE Trans. Comp., C-23, 8, 1974,
739-767.

5. B. R. Rau, "Levels of representation of programs
and the architecture of universal host
machines," Proc. 11th Ann. Wkshp. on Microprog.,
1978, 67-79.

6. L. W. Hoevel and M. J. Flynn, "The structure
of directly executed languages: a new theory
of interpretive system design," Digital
Systems Lab. Tech. Rep. No. 130, Stanford
Univ., March 1977.

7. J. E. Hopcroft and J. D. Ullman, Introduction
to Automata Theory, Languages and Computation,
Addison-Wesley, 1979.

8. J. B. Johnston, "The Contour Model of Block
Structured Processes," SIGPLAN Notices,
Vol. 6, Feb 1971, 52-82.

9. D. E. Knuth, "Semantics of context-free languages,"
Math. Sys. Theory, 2, 2, 1968,
127-145.

10. P. M. Lewis, D. J. Rosenkrantz and R. E.
Stearns, Compiler Design Theory, Addison-Wesley,
1978.

11. E. A. Hauck and B. A. Dent, "Burroughs'
B6500/B7500 stack mechanism," Proc. SJCC,
1968, 245-251.

12. R. E. Sweet, Empirical Estimates of Program
Entropy, Ph.D. Dissertation, Dept of Computer
Science, Stanford Univ., 1976.




BEGIN [2]

PROC [49]

BEGIN [1] INT

PUSHI+ [1] ASSIGN (2,0) PUSHI+ [0] ASSIGN (2,1) PUSHVAL+

PUSHI+ [0] TOT BRFC [25] PUSHVAL+ (2,0) PUSHVAL+ (1,0) TRB BRFC
[16]

PUSHVAL+ (1,1) PUSHVAL+ (2,0) ADD ASSIGN (1,1) PUSHVAL+ (2,0)
PUSHI+ [1] ADD ASSIGN (2,0) BRBU [23]
BRFU [4]

END BRETURN
INT

PUSHI+ [10] ASSIGN (0,1) CALL2 (0,0) PUSHVAL+ (0,1) PASSVAL PUSHADDR
(0,1) PASSADDR
END HALT


Figure 8. "Assembly" language program. Numbers in "[ ]" represent
literal values; those in "( )" represent address couples.
The lexical level of the outermost block (main program) is
0, that of the procedure is 1 and the inner block is at
lexical level 2.


Footnotes (to Fig. 6)

The address couple has the format (lexical level, ordinal number of variable in
the declaration list). For both, the numbering starts with 0.

Val-or-loc is an explicitly propagated attribute which can assume one of two
values, specifying, respectively, whether the value or the address of the
identifier is required. Since this attribute can assume only two values, it is
better taken care of by assuming two different semantic tokens (op codes) where
necessary; e.g., 49VAL and 49LOC (vide Figure 7).
TWENTY YEARS OF BURROUGHS HIGH-LEVEL LANGUAGE MACHINES

E. Dean Earnest

Burroughs  Corporation 
Mission  Viejo,  California 


Abstract 

A  discussion  is  presented  of  several 
computer  systems  developments  over  the  past  20 
years  at  Burroughs  Corporation.  Some  of  the 
system  design  philosophy  and  concepts  employed 
by  the  system  designers  are  included  to  pro¬ 
vide  an  understanding  of  the  motivation  of 
certain  design  decisions. 


The basic set of machine design and use
concepts were first publicly discussed by Bob
Barton in 1961. The first commercial delivery
of a machine whose design was based on this
approach (the Burroughs B5000) was made in the
early 1960s. The concepts embodied in that
system have been expanded over the past 20 years
through insights made possible by our accumulated
experience in high-level language processing
environments.

A brief discussion is presented of some of
the concepts and design principles which have
guided Burroughs' computer systems design. A
review of some representative developments from
selected systems design projects is included
with some of the design and use ideas which were
incorporated.

General  Concepts  and  Ideas 

Burroughs' computer systems architecture for
the past 20 years is a consequence of the articulation
of and adherence to a relatively small
set of closely related design concepts and ideas.
Following are representative of these tenets:
Introduction


High-Level Languages


A discussion of Burroughs Corporation's 20
years experience with high-level language
machines should be considered in the context of
some of the concepts and philosophies which
served to guide the system designers.

A  central  theme  which  has  guided  the  devel¬ 
opment  of  computer  systems  for  over  20  years 
at  Burroughs  can  be  characterized  as  follows: 

The  role  of  computer  systems  is  to 
facilitate  communication  between 
people  through  the  amplification  of 
human  capabilities.  Anything  which 
creates  a  distraction  from  the 
achievement  of  this  role  should  be 
regarded  as  being  wrong. 

The  use  of  higher-level  languages  throughout 
Burroughs  computer  systems  is  consistent  with 
that  theme.  The  development  and  evolution  of 
efficient  machine  architectures  to  support 
those  abstract  notations  significantly  facil¬ 
itates  communication. 
One of the more important concepts introduced
with the Burroughs B5000 was a dedication to the
use of higher-level programming notation to the
practical exclusion of machine or assembly languages.
It was proposed and demonstrated that
a computer system could be designed and implemented
which would provide a sympathetic and
efficient host to an exclusively higher-level
language processing environment.

At  the  time  of  introduction  of  the  B5000, 
higher-level  languages  were  considered  to  be  of 
limited  practical  value  in  the  real  world  of 
information processing. Their use consumed
vast  amounts  of  resources  (particularly  time) 
for  the  compilation  process. 

The resource consumption for the compilation
process was considered so severe that users
frequently abandoned the high-level representation
of a program after the initial design
and an error-free compilation. They frequently
completed the testing and patching process in a
more primitive representation. They thereby
avoided solving the basic problem of not having
an efficient language processing system. As a
result of this multiple representation, the
operational program did not resemble the initial
high-level description.

In addition to the problems with compilation
performance, the object programs executed significantly
slower than the purportedly equivalent
programs written in lower-level notations.

On contemporary machines, both performance observations
were valid. The problems confronting
compiler writers were significant: conventional
machines were not designed to facilitate the
mapping of an abstract notation to the set of
primitive functions supported by those machines.

In  spite  of  these  drawbacks,  higher-level 
languages  achieved  some  acceptance  because  of 
the  now-recognized  advantages  of  their  use  for 
program  design,  implementation,  and  enhance¬ 
ment. 

Since the B5000 was designed to efficiently
handle programs written in ALGOL 60, it was
natural to implement all programs, including
systems software, in that language [18]. The use
of higher-level languages for all programming
was critical to the success of the entire project.
The approach permitted a continued
interaction and feedback among the hardware
and software designers, the system implementors,
and the system users. During the course of the
B5000 project and subsequent developments, the
roles of most of the participants in the design
changed. Systems designers subsequently
became software designers. These, in turn,
became software implementors who are included
in the population of systems users. The
continued, exclusive use of higher-level languages
contributes to a fluency in those
languages. It also provides strong motivation
for the development of an efficient system. At
Burroughs, the system users are system designers
and are expected to contribute to the
hardware and software architectures, implementations,
and enhancements.

The viability of using higher-level languages,
which was demonstrated on the B5000,
reinforced Burroughs' commitment to the approach
on subsequent systems designs and
program product developments.

It  should  be  noted  that  while  high-level 
languages  have  achieved  a  certain  acceptance 
today, it is largely due to advances in
compiler  technology.  Some  modern  compilers  do 
achieve  an  acceptable  performance  level.  Else¬ 
where  in  the  industry,  machines  are  not  being 
designed  to  facilitate  high-level  languages. 

The  Design  Team 

A blending of technologies and experience is
required for the design of a commercially viable
computer system. At Burroughs, a system
design team typically consists of a very small
group of people from the several necessary disciplines.
Each participant must, of course, be
well qualified in a particular discipline and
must have a good working knowledge in the other
represented areas. This cross-discipline knowledge
is necessary for effective contribution to
the design and implementation decisions.

There has been much written about the integrated
hardware/software approach to systems design.
Experience has shown that it is not
sufficient to collect experienced people from
the contributing disciplines. As Bobby Creech
observed in his paper on the B6500 architecture,
the attitude and the personality of the participants
are critical to a successful system design [2].
Intelligence, common sense, and previous
experience help considerably, but the successful
blending of these three attributes requires the
correctness of the contributors' attitude and
personality.

Design  Scope 

Bob Barton, as indicated in his 1961 paper on
a computer system design approach, suggests that
higher-level programming languages should be
employed for all programming tasks to the practical
exclusion of lower-level notations [1].
Additionally, he believed that the operation of
the computer system should be under control of
the system itself. This injection of user and
operator perspective into the system design
process implied a much broader utilization of
high-level languages than had been considered
in prior systems. Contemporary machines of that
era attempted to implement a higher-level language
in the hostile environment of a machine/
assembly language system. To provide a consistent
implementation, the design team on the
B5000 broadened their scope of responsibility
to include the entire programming and operational
environment of the system.

Early in the higher-level language system era
at Burroughs, Lloyd Turner and other software
team members developed a particularly effective
graphical representation of the ALGOL language
syntax [3]. This representation significantly
clarified the language structure for the team
and permitted new insight into an effective
compiler implementation. Additionally, this
representation and understanding of the language
permitted the definition of consistent extensions
to the language when other components of
systems programming and operation were considered.
The entire software system was
implemented in ALGOL (as was the ALGOL compiler
itself). Since the scope of the systems designers'
responsibility encompassed the entire
hardware, programming, and operational environment,
additional opportunities were available
for the partitioning and implementation of
required functions. Commonly used functions as
well as systems management algorithms were
factored out of the user's environment into the
operating system. Where appropriate, these
functions were replaced in the user's environment
by calling (naming) syntax which was consistent
with the calling language. This system-wide
approach to the use of higher-level languages
provided a natural environment for the
handling of general systems functions. These
functions were represented by a syntax which
was consistent with that utilized for the
systems software. This environment permitted
the development and integration of such innovations
as automatic memory management,
virtual memory and general file management
into the operating system. A description of
the results of this pioneering effort is included
in the B5500 Master Control Program description [15].

The commitment and the adherence to the exclusive
use of higher-level languages throughout
the system produced a systems software and
usage base which could be readily enhanced.

The interface between cooperating software
modules implied by the consistent use of
higher-level abstractions permits new functions
to be easily integrated into the software
system. This abstraction also allows software
systems to be propagated over several generations
of hardware. Software subsystems, such
as the Network Definition Language [4], the Data
Management Languages [5], and augmented operational
dialogues which have been implemented over
the past several years have been guided by the
global perspective suggested by Barton and
enhanced by subsequent software teams.

General  Design  Principles 

The preceding discussion suggests that the
recognition of and adherence to a closely
interrelated set of sound concepts and design
principles provides far-reaching benefits.
This conceptual base is required to be successful
in the typical commercial systems environment
of evolution, growth, and change. In
addition to the concepts and ideas previously
mentioned, the following are representative
complementary design principles which have
proven successful at Burroughs.

Recursive Definition. This simple approach
can be employed to verify the consistency,
completeness, and orderliness of a defined
object. Several current notation systems permit
solution definition as a recursive process.

Minimal Representation of Information. Not
all information has the same importance when
considered in a language, program, or system
context. The use of a higher-level programming
notation wherein information can be represented
as appropriate to its static and dynamic usage
frequency offers some interesting options to be
exploited by system implementors. As an example,
Don Knuth has reported on the extremes in
FORTRAN function usage in that operational language
environment [17]. This representational
freedom allows for significant systems performance
trade-offs to be effected. Wayne Wilner,
in his paper on B1700 memory utilization, presents
some interesting observations and comments
on the dramatic effects which may be achieved
through optimal information representation [19].

The principle of minimally representing information
is consistent with the abstraction of
higher-level  languages.  In  natural  languages, 
also,  people  abstract  and  codify  high-usage  com¬ 
munication  sequences  for  efficiency  and  compre¬ 
hension. 

The Importance of Information Structures.
Burroughs' emphasis on the efficient handling of
information structures, particularly control
structures, has provided far-reaching benefits.
The use of the stack in our machine architectures
for the partitioning and handling of subroutines,
procedures, and processes has permitted the
practical application of several of the concepts
and ideas noted in this paper. Additional benefits
of the use of the stack mechanism include
those which contribute to the multiprogramming,
multiprocessing, information protection, and
control distribution facilities of typical
Burroughs systems.

Abbreviated  History 

Observers of Burroughs systems developments
have detected a consistent philosophy regarding
systems appearance from the perspective of
programmers and users. These observers correctly
concluded that the primary impetus for
the control and guidance necessary to maintain
this image is largely attributable to an informal
and long-standing relationship among key
Burroughs technical personnel. This group
shares both a personal rapport and a commitment
to a set of system design and use concepts. In
informal meetings and conversations, Barton,
Lloyd Turner, and others have served as a
catalyst for the elaboration of the original and
the synthesis of new ideas and concepts. With
this common experience as a basis, it is not
surprising that there are repetitions in concept,
approach, and appearance within the several
Burroughs systems.

Following is a brief discussion, not necessarily
in chronological order, of the evolution
of some attributes of higher-level language
oriented systems at Burroughs. Also included
are observations on some of the reasons for
particular developments or emphasis.

The B5000, B6000, B7000 Series

In the late 1950s, Burroughs implemented an
early version of the ALGOL language on the
Burroughs B220, a conventional machine of that
era. This implementation served to prove
several of Barton's original higher-level language
machine concepts. It provided a vehicle
for the evaluation, feedback, and refinement of
an ALGOL virtual machine.
The B5000 system was announced in 1961. The
successor B5500, announced in 1964, included a
large, fast secondary storage facility and a
more comprehensive operating system. Further
enhancements were announced with the B5700 in
1970.

The B6500 system, announced in 1966, incorporated
significant enhancements to earlier
machines and integrated many new ideas and
innovations. The B6800 system, which was
announced in 1976, provided a more effective
implementation of the B6000 series architecture.
It also incorporated features and
function for consistent work and resource
sharing among multiple local and/or distributed
systems.

The B7700 large-scale system, which was
introduced in 1970, provided both source and
object-language compatibility with the B6000
series systems. Additionally, it offered
enhanced performance, information integrity,
and distributed input-output facilities. The
B7800, introduced in 1977, is a higher performance
version of the B7000 series.

Following are typical of the ideas and
concepts of the B5000, B6000, and B7000
systems:

The Stack. Many of the concepts and ideas
previously noted were applied in the design of
the B5000 system. One of the more important
ideas embodied in that machine was the integration
of the stack into the machine architecture.
The stack mechanism is particularly
effective in the ALGOL language handling environment.
The power of the stack lies in the
control mechanism that can be embedded in it
and its use for dynamic temporary storage.
This facility permits efficient evaluation of
arithmetic expressions and storage of parametric
and control information for generalized
subroutine and procedure handling. It also
allows an effective reduction in program storage
requirements since the top of the stack
provides an implied address for most of the
order codes of the machine. A complete description
of the stack and other features of the
B5000 and its successor, the B5500, can be
found in the Burroughs Reference Manual on
those systems.


The stack implementation on the B5000 and
B5500 was enhanced during the design of the
B6500. An evolution of the B6500 stack structure
is employed in the current Burroughs B6000
and B7000 series. Based on experience with the
B5500, the addressing mechanism for local and
global variables was more consistently developed,
so that the dynamic addressing environment encountered
in the execution of programs is
maintained automatically by the stack and related
structures. In addition, the concept of
a "cactus stack" was introduced to provide a
vehicle for the more orderly control of multiprogramming
and multiprocessing. A good treatment
of the use of the cactus stack in process
handling is provided by Jack Cleary in his paper
on that subject.

The cactus stack may be viewed as a tree of
stacks with the trunk containing the basic
operating system process representation. Branches
from the trunk contain control and parametric
information for new processes as they are
created. This structure differs from conventional
trees in that the trunk can continue to
grow after branches have been created. One
graphic representation of this structure
resembles the Saguaro cactus of the southwest
United States, hence the "cactus stack"
designation. The paper by Erv Hauck and Ben Dent
furnished an excellent discussion of the details
of the B6500 stack. Details may be found in
the Systems Reference Manual. Elliott
Organick's book on the B6700 provides a good
treatment of the cactus stack in the context of
an overall system description. 10
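
The tree-of-stacks idea can be sketched in a few lines. This is
a simplification under stated assumptions, not the B6500
mechanism: the class name Stack, the name/value entries and the
rule that a branch sees only the trunk entries present when it
was created are all choices made for this illustration.

```python
# Simplified cactus stack: each branch keeps its own entries and
# a link to the point on its parent where it was created.
class Stack:
    def __init__(self, parent=None):
        self.parent = parent
        # Height of the parent when this branch was forked.
        self.fork_height = len(parent.entries) if parent else 0
        self.entries = []

    def push(self, name, value):
        self.entries.append((name, value))

    def fork(self):
        """Create a branch (for example, for a new process)."""
        return Stack(parent=self)

    def lookup(self, name):
        stack, limit = self, len(self.entries)
        while stack is not None:
            # Search from the top down, below the visible limit.
            for entry_name, value in reversed(stack.entries[:limit]):
                if entry_name == name:
                    return value
            stack, limit = stack.parent, stack.fork_height
        raise KeyError(name)


trunk = Stack()
trunk.push("os_table", 1)
branch_a = trunk.fork()
branch_a.push("x", 10)
trunk.push("late", 99)        # the trunk keeps growing
branch_b = trunk.fork()
branch_b.push("x", 20)
```

Here branch_a and branch_b hold independent values of x while
sharing the trunk entries that existed when each was created.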

The Descriptor. The descriptor on Burroughs
systems is a highly-encoded sequence of program
which is executed when it is encountered during
accessing of information. The descriptor may be
regarded as a generalized form of control word.

It is used to separate those functions
associated with the information definition and control
from procedural code. This separation of
description and function facilitates the handling
of data and program while maintaining the
high-level abstraction of the user environment. Good,
detailed descriptions of this powerful facility
can be found in the paper by Hauck and Dent and
in Organick's book on the B6700.
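
A rough sketch of the separation described above follows. The
field names base, length and present are assumptions chosen for
the illustration and do not reproduce the Burroughs descriptor
layout. The point is only that the checks travel with the
description of the data, so procedural code contains none.

```python
from dataclasses import dataclass


# Hypothetical descriptor: describes a data area and is
# interpreted on every access made through it.
@dataclass
class Descriptor:
    base: int             # start of the area in memory
    length: int           # number of elements in the area
    present: bool = True  # whether the area is in main memory


class Memory:
    def __init__(self, size):
        self.cells = [0] * size

    def _resolve(self, descriptor, index):
        if not descriptor.present:
            # A real system would fetch the segment here.
            raise LookupError("segment not present")
        if not 0 <= index < descriptor.length:
            raise IndexError("index outside described area")
        return descriptor.base + index

    def load(self, descriptor, index):
        return self.cells[self._resolve(descriptor, index)]

    def store(self, descriptor, index, value):
        self.cells[self._resolve(descriptor, index)] = value


memory = Memory(32)
table = Descriptor(base=8, length=4)
memory.store(table, 2, 77)
```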
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A  major  objective  of  the  B2000,  B3000,  and 
B4000 systems design was a family of systems
which would be efficient at character handling.
Specifically, the systems were to provide an
effective and efficient host for the COBOL
program environment and for character-oriented
peripherals such as data communication
terminals and magnetic and optically encoded
document handlers.

The B2500/B3500 systems were introduced in
1966. The B2700/B3700/B4700 enhancements to the
series were announced in 1970 and 1971. The
B2800/B3800/B4800 systems, which provided both
higher performance and machine-language
compatibility with earlier systems in the series,
were announced in 1975 through 1977. Many
enhancements to the B2000 series have been
integrated into the B2900 systems, which were
announced in 1979.

General Architecture. The experience base
for a machine which could perform well in a
character-oriented environment began with the
B200 systems of the early 1960s and included
observations and experience with the B5000 and
B5500 systems. 14

The processor and memory of the B2000-B4000
systems are oriented toward the character,
field, and record requirements of the COBOL
language. The instruction set accommodates
variable-length strings of alphanumeric and
numeric representations.

Because of the dominance of field-to-field
operations in the COBOL operational
environment, the processor was designed to utilize
primarily a memory-to-memory instruction
implementation. Since the processor retained
minimal state between instructions, the system
could quickly respond to interrupts from the
high frequency of input/output operations in a
typical data processing environment. This fast
interrupt response facilitated the handling of
data communications requirements. It also
allowed the handling of the real-time functions
of high-volume document handling peripherals
in a multiprogramming mix.

The machine also incorporated a stack
mechanism to facilitate the handling of control in
the COBOL and operating system environments.
Since the stack was mapped into the memory area
for each program or process, it did not detract
from the rapid state-switching requirements of
the system.

EDIT Instruction. The application of
experience and observations for development and
implementation of character handling language
and functions is typified by the B2000 series
EDIT instruction.

The character handling facilities of the B5000
machine and the necessary primitives to
accomplish the COBOL-specified MOVE and EDIT functions
were not well designed or implemented on that
machine. COBOL was a new programming language
at the time of the B5000 design. There was
little experience with the practical requirements
of that language environment. Additional
information was required on the problem of mapping
the requirements of the MOVE and EDIT functions
on the B5000. The compiler group developed an
enumeration and representation of the functional
requirements defined by COBOL. They then
performed a simulation of the virtual machine
implied by that form and semantics. This
experience and the resultant insights provided a
sufficient basis for the appropriate generators
in the COBOL compiler for the B5000. The
representation, algorithms, and techniques
developed for the B5000 compiler were supplemented by
the results of observations on that virtual
machine. This experience served as a basis for
the design and implementation of the MOVE/EDIT
instruction on the B2000, B3000, B4000 systems.

On  those  machines,  most  MOVE  verbs  in  COBOL  can 
be  performed  by  a  single  instruction. 
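
To give a feel for the kind of work a single edited MOVE
performs, the following sketch applies a small subset of COBOL
picture editing (the symbols 9, Z, comma and period) to an
unsigned number held as a count of its smallest unit. It is an
assumption-laden illustration: the function name edit_move is
invented here, and the sketch says nothing about how the B2000
series EDIT instruction is actually encoded or executed.

```python
# Picture-driven numeric edit (subset of COBOL editing rules).
# 'Z' suppresses leading zeros, '9' always shows a digit,
# ',' is blanked while suppression is active, '.' is literal.
def edit_move(value, picture):
    if value < 0:
        raise ValueError("sketch handles unsigned values only")
    positions = sum(ch in "9Z" for ch in picture)
    digits = str(value).rjust(positions, "0")
    if len(digits) > positions:
        raise ValueError("value does not fit the picture")
    source = iter(digits)
    suppressing = True
    out = []
    for ch in picture:
        if ch == "9":
            suppressing = False
            out.append(next(source))
        elif ch == "Z":
            digit = next(source)
            if suppressing and digit == "0":
                out.append(" ")
            else:
                suppressing = False
                out.append(digit)
        elif ch == ",":
            out.append(" " if suppressing else ",")
        else:
            # Any other symbol (here '.') is inserted as written.
            suppressing = False
            out.append(ch)
    return "".join(out)


edited = edit_move(1234567, "Z,ZZZ,ZZ9.99")
```

On a machine without such an instruction, each picture symbol
becomes a short sequence of compiled code; folding the loop
into one instruction is what lets most MOVE verbs compile to a
single order.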

Details of the structures and operations
implemented on this family of systems can be
found in the Reference Manual for those systems. 11

The  B1000  Series 

The current Burroughs B1000 series (B1700,
B1800, B1900) was designed to support a
multiplicity of high-level language and processing
environments. In addition, the system was
intended to support the emulation of several
existing and/or proposed machines.

The initial systems of the B1000 series, the
B1700s, were announced in 1972 and 1973. The
B1800s, which incorporated significant
performance enhancements, were introduced in 1976.
Initial B1900 systems were announced in 1979.


The Design. Based on analysis and experience,
the design team concluded that the range of
representations and functions dictated by the proposed
set of programming languages and machines could
not be directly accommodated with a single,
commercially viable architecture. A sufficiently
small set of structures and operators could not be
defined which was efficient for all languages and
processing environments. A machine architecture
was indicated which could be adapted to each
processing and language requirement.

The B1700 system design included an attempt to
define a machine which had no inherent structure
and no a priori instructions. To satisfy this
design objective, a passive machine was required
which could accommodate definable information
structures and instructions.

The design approach used on the B1700 system
was to anticipate a unique machine architecture
for each programming language and emulation
environment. The designers had to consider both the
typical high-level forms of program representation
as well as machine-language forms from existing
machines. Restated, the B1700 design objective
was to efficiently emulate a set of real and
virtual machines.

Variable-Field Handling. The ability to vary
the machine's image for each emulation
environment implies some very specific hardware and
software adaptations. Fortunately, our experience on
several prior machine designs and research
projects suggested several potential solutions to
this variable-environment processing problem.

It was observed that data and program are
frequently not suited to the representation imposed
by typical word or character organized storage and
processing elements. The actual nature of program
and data demands variable size representation.
Considering the range of storage and processing
environments of the B1700 system, the smallest
unit of information, the bit, must be addressable
in order to provide complete flexibility in the
mapping and processing solutions. To accommodate
this requirement, the B1700 system was designed
with a defined-field storage capability. In this
memory system, all storage is addressable to the
bit, all field lengths are expressible to the bit,
and storage hardware fetches and stores one or
more bits from any location with equal facility.
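
Defined-field storage can be pictured with the following
sketch, in which every access names a bit address and a field
length in bits. The class BitMemory, its method names and the
convention that the first bit of a field is its most
significant bit are assumptions for the illustration, not a
description of the B1700 hardware.

```python
# Bit-addressable store: fields start at any bit and have any
# length; no word or character boundaries exist.
class BitMemory:
    def __init__(self, size_bits):
        self.bits = bytearray(size_bits)  # one element per bit

    def _check(self, address, length):
        if length < 1 or address < 0 or address + length > len(self.bits):
            raise IndexError("field lies outside memory")

    def read(self, address, length):
        self._check(address, length)
        value = 0
        for bit in self.bits[address:address + length]:
            value = (value << 1) | bit
        return value

    def write(self, address, length, value):
        self._check(address, length)
        if not 0 <= value < (1 << length):
            raise ValueError("value does not fit the field")
        for i in range(length):
            shift = length - 1 - i
            self.bits[address + i] = (value >> shift) & 1


store = BitMemory(64)
store.write(3, 5, 0b10110)   # a 5-bit field at bit address 3
store.write(8, 13, 4097)     # a 13-bit field right after it
```

An emulator for a machine with, say, 13-bit fields and another
for 5-bit fields can both lay out their data in the same
store without padding.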


The B1700 processor was designed to provide an
efficient vehicle for the emulation of multiple
language processing environments. The instruction
set of the machine included primitives from
the set of programming language and emulation
environments as well as those which contribute to
the emulation, or interpretation, process itself.
For example, the Arithmetic-Logic Unit could be
parameterized to a width which corresponds to the
data or machine being handled. A good exposition
of the B1700 design was provided by Wayne Wilner
in his paper on that subject and is detailed in
the System Reference Manual. 12, 13 The book by
Organick and Hinds contains an excellent
description of the B1700/B1800 systems architecture
and application. 20

Language-Specific Machines. The congruency of
the functions dictated by a processing
environment and the repertoire of structures and
operators supported by a machine generally determines
the efficiency of a system. For the B1000
systems, an "ideal" machine was designed for each
processing environment. Where an existing
machine was to be emulated, the form and semantics
of that machine constituted the definition. After
the machine definition, an emulator, or
interpreter, was developed which provided the semantic
definition of that virtual machine. Thus, the
compiler writers had an ideal machine structure
and operator set for their object code. This
repertoire of structures and operators provided an
isomorphic relationship between most functions
expressed in the high-level language and the
target machine.

Optimization.  Since  the  virtual  machine  could 
be  adapted  to  each  processing  and  language  envi¬ 
ronment,  facilities  were  integrated  into  the  de¬ 
sign  to  optimize  the  adaptations.  Tools  and 
techniques  were  indicated  which  could  supplement 
our  perception  of  the  environment  with  empirical 
information. 

Both hardware and software facilities were
integrated into the system to permit static and
dynamic observations on the virtual machine's
representation and performance. These
observations were utilized to extend our knowledge base
on these language-specific machines. Virtual
machine definition and representation are changed
as indicated by static and dynamic observations on
the machine's behavior. This technique, and the
adaptability of the machine, has permitted very
effective enhancement and optimization efforts to
be realized.

It should be noted that the exclusive use of
higher-level languages contributes significantly
to the success of the optimization efforts. The
use of abstract programming notations provides the
necessary representational freedom to effect the
indicated virtual machine changes. Some
additional background material and experience with the
application of the systems monitor facility is
provided by Russ Hagen in his paper given at a
computer performance seminar. 21 A description of
the supplemental functions provided in a
performance measurement subsystem can be found in the
System Performance Monitor Reference Manual. 22

Resource Management. The B1000 systems support
the concept that the machine should manage its own
environment. These systems incorporate the
standard Burroughs set of operating systems scheduling
and other resource management facilities. Program
and information segments are handled automatically
for both interpreter and virtual machine processes.

At a typical installation, several language
environments may be concurrently active in a mix
of programs. Through appropriate information
integrity and resource management mechanisms, each
user views the system as a dedicated facility
designed to effectively accommodate his particular
language environment.

Summary. 

The comprehensibility of communications as a
result of the exclusive use of higher-level
notations throughout Burroughs computer systems
enhances their role in human communication. The
development and evolution of efficient machine
architectures to support abstract information
representations makes the use of higher-level
languages effective and practical.

Acknowledgement

Many people have contributed to the set of
concepts, ideas, and design principles included
in this paper. Their application in Burroughs is
a tribute to the strong commitment and
persistence of Bob Barton and the B5000 team. This
group, and the many participants in Burroughs
developments over the past 20 years, have
expanded and amplified the basic set of ideas.

The author wishes to thank John McClintock and
Barbara Bennett for their conscientious criticism
of various drafts of this paper.
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Many  high-level  language  machines  in  Japan 
have been built, covering most high-level
languages. Several proposals and experiments have
been made since the late 1960s, and significant
research started after 1975.

Most of them are experimental or proposed
machines. There are only a few commercial high-level
language machines. It is characteristic that much
LISP and APL machine research has been carried out at
laboratories and universities, while a few FORTRAN
and COBOL machines have been made by computer
manufacturers.

Introduction

This survey report is an overview of the
activities related to high-level language machines
in Japan. Commercial, experimental and proposed
machines are covered. In order to cover most
high-level language machines, more space is devoted to
significant characteristics in their
intermediate-language architectures, hardware structures,
software/firmware/hardware tradeoffs and evaluation
data than to their detailed architectures and
hardware configurations. For easy understanding
and clarification of their differences,
architectural comparisons between high-level language
machines for the same high-level languages are
considered.

High-level language machine research in Japan
has been carried out for most high-level languages. Much
of it, however, concentrates on experimental-level
high-level language machines, and there are only a
few commercial-level high-level language machines.
Several proposals and experiments were made at the
end of the 1960s and in the early 1970s. Significant
research efforts have started after about 1975, as
shown in Fig. 1.

Generally speaking, it is characteristic that
much research data have been gathered on LISP and
APL machines at laboratories and universities, and
a few FORTRAN and COBOL machines have been made by
computer manufacturers.

References are listed at the end of this
report, in which the reader can find detailed
information. Unfortunately, most of them are written
in Japanese.
High-Level Language Machines
PL/I  Processors 

PL/I is the most complex commercial high-level
language. Hence it is time-consuming to process
on a conventional computer. Therefore, the
appearance of advanced and consistent PL/I processors has
been desired for quite a while.

The first significant step in the research on
high-level language machines in Japan occurred with
the proposal for a PL/I processor by M. Sugimoto.
In 1969, he proposed a PL/I processor1 composed of
a translator, called the PL/I reducer, and a
hardware interpreter, called the direct processor.
The PL/I reducer translates a PL/I program into a
list-structured intermediate language, DIPL
(Direct Processor Input Language), that consists of
four parts: Program Structure List (PSL), Statement
Normal Form List (SNFL), Attribute List (AL) and
Constant List (CL). The direct processor consists
of several functionally autonomous units, as shown
in Fig. 2.

The PL/I reducer has been implemented. For
typical scientific programs, the object code length
has been reduced by 25% on the average,
compared to that of the object code generated by
the PL/I compiler available at that time.
According to the timing simulation program for the direct
processor, it was shown that a 28% speed gain over
the conventional computing system can be obtained
for arithmetic/string operations.
Fig. 2 Block Diagram of the Direct Processor


FORTRAN  Processors 


FORTRAN or array processors are the only
commercial high-level language machines in use in Japan.
Some of them have actually been used as an attached
or integrated processor in a conventional general
purpose computer system for performance enhancement
of FORTRAN program execution. Also, in accordance
with recent urgent requirements for effective
execution of large scale scientific applications, more
powerful array processors have been planned.


In 1973, S. Takahashi et al. at Hitachi Ltd.
reported results of fundamental, experimental
research efforts on a firmware FORTRAN processor2,
where FORTRAN source statements are translated into
both reverse polish and mixed reverse polish
intermediate texts. In mixed reverse polish, arithmetic
statements are translated into reverse polish texts
and IF statements are translated into normal polish
texts, except for arithmetic expressions in them.
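
As an illustration of what a reverse polish intermediate text
looks like, the sketch below converts an infix arithmetic
expression into reverse polish order. It uses the well-known
shunting-yard method, which is an assumption of this example
and not necessarily the translation scheme used in the Hitachi
experiment; the function name and token rules are likewise
invented here, and malformed input is not diagnosed.

```python
import re

# Binary operators only; higher number binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def to_reverse_polish(expression):
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()]", expression)
    output, operators = [], []
    for token in tokens:
        if token in PRECEDENCE:
            # Emit operators that bind at least as tightly
            # (this makes all four operators left-associative).
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[token]):
                output.append(operators.pop())
            operators.append(token)
        elif token == "(":
            operators.append(token)
        elif token == ")":
            while operators[-1] != "(":
                output.append(operators.pop())
            operators.pop()  # discard the "("
        else:
            output.append(token)  # variable or constant
    while operators:
        output.append(operators.pop())
    return output


text = to_reverse_polish("A + B * (C - D)")
```

The resulting operand-then-operator order is what a stack-based
firmware interpreter can execute without further parsing.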

The authors concluded that the execution time
ratio for reverse polish and mixed reverse polish
built in microprograms, reverse polish in software
and object machine codes is 0.8 : 1.3 : 9.7 : 1, based
on a FORTRAN dynamic statement mix. On the other
hand, the object memory capacity ratio is 0.53 :
0.58 : 0.58 : 1, based on a FORTRAN static statement
mix.

The FACOM 230-75 APU (Array Processor Unit)3,4
from Fujitsu Ltd. is a pipelined vector machine
attached to a FACOM 230-75 system in which the APU
and CPU (Central Processor Unit) share the main
memory (Fig. 3). The APU machine structure is
characterised by various kinds of internal
registers (vector registers, data registers and base
registers), vector descriptors and powerful vector
instructions for array or vector operations. A
FORTRAN user's program is written in AP-FORTRAN,
which is an extension of standard FORTRAN to
include vector functions. It was indicated that the
maximum APU performance is 22 Mega Floating-Point
Operations and the APU system performance of
various application programs written in AP-FORTRAN is
4-20 times that for corresponding CPU programs.

An APU system was installed in Japan's National
Aerospace Laboratory.

The IBM System/360 - 2938 AP and the FACOM
230-75 APU are attached processors, connected to the central
processor through an I/O channel or a shared main
memory. In order to solve the problems that a large
amount of hardware was necessary and that a special
description using non-standard FORTRAN would be
required, Hitachi Ltd. developed the M-180 IAP
(Integrated Array Processor)5, where array processing
functions are included within a central processing
unit as a general instruction set (vector
instructions). A concise vector instruction set,
consisting of 28 instructions, was selected based on an
analysis of the statistics on the behaviour of
FORTRAN programs, obtained using a software tool,
FORMAT e. In the M-180 IAP, FORTRAN user's programs
written in standard FORTRAN are vectorised through
the vectorising FORTRAN compiler6. It was shown
that about 50% of the execution steps in the
benchmark programs can be vectorised by the 28 vector
instructions.
BASIC Machines

In 1974, Y. Nagai, M. Yamamoto et al. of NEC
Ltd. quantitatively analysed software/firmware/
hardware tradeoffs in a BASIC interpreter. For this
purpose, three kinds of high-level language
machines, a software-implemented BASIC interpreter
(S-BASIC), a firmware-implemented interpreter
(F-BASIC) and a firmware-implemented interpreter with
additional hardware (H-BASIC), were implemented.
F-BASIC is implemented with firmware on the
General Purpose Microprogrammed Simulator (GPMS). To
reinforce the F-BASIC performance, hardware
functions, such as transfer/pointer operations,
associative functions and so on, were introduced into the
H-BASIC8 on the microinstruction level. Each BASIC
processor translates a BASIC program into the same
intermediate language, and then interprets it.
Experimental results8 show that a 17 times
performance improvement is obtained by adopting firmware.

A further performance improvement was obtained
by introducing appropriate hardware functions. The
memory capacity necessary for a language processor
was also reduced.

M. Yamamoto, an implementor of the preceding
experiment, proposed an advanced high-level
language architecture10 for a BASIC machine as an
extension of the above three BASIC interpreters in
1975. The BASIC machine is capable of both
translation and interpretation of a BASIC program and is
characterised by a tagged architecture, a large
number of general purpose registers and powerful
machine instructions. In addition, bit-handling,
masking and table-pointer operations are also
installed. It was estimated that the BASIC machine
performance is about 2 times that of F-BASIC.

T. Maruyama of Himeji Institute of Technology
made a BASIC interpreter11,12 on a general purpose
minicomputer, HP-21MX. Using a software translator,
BASIC programs are translated into an intermediate
language, which is interpreted by a firmware
interpreter. In the interpreter, commonly usable
functional routines, such as table pointer/entry
manipulations, data conversions and arithmetic
operations, rather than the whole of a special
statement, are implemented with microprogram
techniques, based on execution frequency evaluation
data. The microprogram amount is about 1.3k words.
The firmware BASIC interpreter is about 4 to 9 times
faster than a software version on benchmark test
programs.

COBOL  Machine 

COBOL is the most commonly used commercial
programming language. It is used for some 70% of
all programming. Therefore, conventional
computers with specialized functions or
architecture for COBOL and COBOL machines have already
appeared at the commercial level overseas.

On the other hand, in Japan an experimental
COBOL machine11 similar to the NCR COBOL Virtual Machine
has been under implementation since 1975 at NEC
Ltd. The COBOL machine architecture, called COMBAT
(COBOL Oriented Machine Basic Architecture), has
many facilities for efficient COBOL program
execution, e.g. many internal data, data descriptors and
intensive COBOL function capabilities. The COBOL
machine hardware is functionally composed of three
processor modules for instruction fetch, operand
fetch and instruction execution as shown in Fig. 4.
It was indicated that the COBOL machine
execution14,16 is about 3-5 times faster than that on a
medium scale conventional computer. The COBOL
machine is running as a processor attached to the
conventional commercial computer.


AC: Advance Controller
ILF: Intermediate Language File
FIFO: First In First Out Memory
IFPM: Instruction Fetch Processor Module
OFPM: Operand Fetch Processor Module
EXPM: Instruction Execution Processor Module
MCPM: Memory Control Processor Module
MM: Main Memory

Fig. 4 COBOL Machine Configuration

LISP Machines


The most researched high-level language machine
in Japan is the LISP machine. Since the LISP language
has many distinctive characteristics, e.g. dynamic
data allocation, recursive function call and list
processing, it is impossible to effectively execute
LISP programs on conventional computers. The increase
in research areas for symbol manipulation and the advent
of low cost, highly functional and easily usable
microprocessors have been accelerating the demand
for LISP machines since 1970 in Japan.

An early experiment on a LISP machine was made
by T. Shimada et al. of the Electrotechnical Laboratory
(ETL) in 1974. LISP machine research at ETL has
been performed in three steps. The first experiment
involves a microprogrammed LISP interpreter16,17 on
a user microprogrammable computer, HP-21MX.
A Bobrow stack model is implemented with
microprogram techniques, on which a LISP interpreter is made
with LISP oriented highly efficient instructions.
Backtracking and coroutine functions are also
adopted. It was concluded that a speed about 5 to 6 times
faster than HP-2100 machine instruction codes is attained.
Moreover, much basic evaluation data about the
microprogrammed LISP interpreter were obtained. It is
shown that highly efficient decision making
including multi-path jump, recursive call at the
microprogram control level, bit manipulation and main
memory control are effective for a LISP interpreter.
Based on these evaluation data and experience,
a new LISP machine (ETL LISP) is implemented on
a universal emulation machine, ACE (Adaptive
Computing Element). Internal data forms and
interpreter structure for this LISP machine are
identical to the HP-21MX version. In order to attain
better performance, however, all the interpreter
is written in microprograms, and stack configuration,
hardware register utilisation and memory management
are improved due to using advanced ACE hardware
facilities.

In addition, a virtual LISP machine19 is being
implemented on a powerful 16-bit microcomputer,
whose conceptual structure is shown in Fig. 5.
In the virtual LISP machine, intermediate language
instructions directly corresponding to LISP
functions are considered.
Fig. 5 Conceptual Structure of the Virtual LISP
Machine


The LISP machine NK3 of Kyoto University20,22 is
based on a LISP oriented special processor, which
has a ?2-bit data length, 42-bit microinstruction
length and 64-bit list-cell length. Also, it has
special hardware units, such as a transfer table
for generating microinstruction branch addresses to
aid checking for tag field and data category and a
hardware stack, whose top areas are always stored
in a fast buffer memory. NK3 has about 100
macro-instructions mainly for stack and tag manipulation,
in order to effectively execute LISP functions.
The processing speed of a LISP interpreter on NK3
is 5-6 times that of a LISP system on a general
purpose minicomputer. Figure 6 shows an NK3
block diagram.


Fig. 6 Block Diagram of LISP Machine NK3


Research on LISP machines in Japan was
promoted by the advent of low-cost, high-performance
and easily usable microprocessors, especially bit
or byte slice microprocessors.

K. Taki et al. at Kobe University developed a
LISP processor23, organised with 4-bit slice
microprocessors (Am 2900 series), which has a 16-bit
data length, $6-bit microinstruction length and
32-bit list-cell length. It also has special
hardware components characterised by a 16-bit 4 k word
hardware stack, a field extractor for data masking
and shifting, a 3-bit 1 k word mapping memory
generating a 3-bit usage code corresponding to the
main memory address and a 1-bit 64 k word bit-table
supporting the garbage collection function. Figure 7
shows the hardware structure for the Kobe University
LISP machine, which is connected to a general
purpose computer, FACOM 230-38, through an 8080
microcomputer. A DEC LSI-11 minicomputer performs
initiation and maintenance functions, LISP program
loading and input/output operations.
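
The use of a one-bit-per-cell table in garbage collection can
be sketched as follows. This is a generic mark-and-sweep
illustration, not the Kobe University design: the heap layout
(a list of car/cdr pairs), the convention that integers are
cell addresses while strings and None are atoms, and both
function names are assumptions made for the example.

```python
# Mark phase: set one bit in a separate table for every cell
# reachable from the roots. The list cells themselves carry
# no mark bit.
def mark(cells, roots):
    bit_table = bytearray((len(cells) + 7) // 8)
    pending = list(roots)
    while pending:
        pointer = pending.pop()
        if not isinstance(pointer, int):
            continue  # an atom, nothing to mark
        byte, bit = divmod(pointer, 8)
        if bit_table[byte] & (1 << bit):
            continue  # already visited
        bit_table[byte] |= 1 << bit
        car, cdr = cells[pointer]
        pending.extend((car, cdr))
    return bit_table


# Sweep phase: every cell whose bit is clear is free.
def sweep(cells, bit_table):
    return [i for i in range(len(cells))
            if not bit_table[i // 8] & (1 << (i % 8))]


heap = [("a", 1), ("b", None),   # list (a b), reachable
        ("c", 3), ("d", 2),      # unreachable cycle
        ("e", None)]             # unreachable cell
marks = mark(heap, roots=[0])
free_cells = sweep(heap, marks)
```

Keeping the marks in a dedicated table leaves the full list
cell width available for the car and cdr fields.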


Fig. 7 Hardware Configuration of a LISP Machine
System


T. Usuki et al., from Keio University,
implemented a LISP machine24,25 on a multi-microprocessor
system, which is composed of an interpreter
processor (IP), a storage management processor (SMP)
and an input-output processor (IOP). IP performs
overall control of LISP program processing and LISP
program interpretation, and has a 16-level
hardware stack for sequence control and list
manipulation capabilities. Garbage collection and CONS,
RPLACA and RPLACD function execution are achieved
independently of interpretation on SMP, which is
organised with byte-slice microprocessors, 32 special
registers and a writable control storage. The garbage
collection function is based on Dijkstra's
algorithm. IOP, a general purpose minicomputer
(NOVA), accomplishes input operation of a LISP
S-expression, conversion from it to internal forms
and file processing. Figure 8 shows the
configuration of an experimental multiprocessor system.
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H.  Yasui  et  a  1.  of  Osaka  University  have  boon 
dove l oping  a  new  multiprocessor  LISP  machine,  EVLIS 
ui.n.-hino2<’i  27 .  In  a  traditional  multiprocessor  LISP 
nu'-imin,  list  processing  and  garbage  collection  or 
I/O  processing  are  performed  in  a  parallel  mode, 
on  the  other  hand,  in  EVLIS  machine,  each  argument 
for  a  LISP  function,  EVLIS,  is  parallelly  evaluated 
in  multiple  processors.  It  is  based  on  the  concept 
that  parallel  interpretation  of  EVLIS  arguments  is 
possible  if  an  argument  evaluation  does  not  affect 
the  other  argument  because  of  its  list  alteration 
operation.  Figure  9  shows  the  system  con  figurat  Loir 
of  the  EVLIF,  machine,  in  which  an  evaluation  proc¬ 
essor  can  accomplish  an  argument  interpretation. 
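The idea of evaluating the arguments of a call in parallel can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the EVLIS machine's microcode: the names `PRIMITIVES`, `eval_seq` and `eval_parallel` are hypothetical, threads stand in for evaluation processors, and the sketch assumes that no argument alters list structure seen by another argument, which is the condition stated above.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical side-effect-free primitives used only for this illustration.
PRIMITIVES = {
    "+": lambda *xs: sum(xs),
    "*": lambda a, b: a * b,
}

def eval_seq(form):
    """Ordinary sequential evaluation of a nested-list form."""
    if not isinstance(form, list):
        return form                      # atoms evaluate to themselves here
    op, *args = form
    return PRIMITIVES[op](*[eval_seq(a) for a in args])

def eval_parallel(form, workers=4):
    """Evaluate the arguments of the outermost call concurrently.

    Each worker plays the role of one evaluation processor. Results are
    collected in argument order, so the function sees the same values it
    would see under sequential evaluation.
    """
    if not isinstance(form, list):
        return form
    op, *args = form
    with ThreadPoolExecutor(max_workers=workers) as pool:
        values = list(pool.map(eval_seq, args))
    return PRIMITIVES[op](*values)

expr = ["+", ["*", 2, 3], ["*", 4, 5], 1]
result = eval_parallel(expr)
```

Only the outermost argument list is distributed in this sketch; nested calls are evaluated sequentially inside each worker so that the worker pool cannot starve itself.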

An evaluation processor is organized of Intel bit-slice
microprocessors (the 3000 series) and has a 20-bit data length and a
50-bit microinstruction length. A 40-bit list cell can be brought into
a CAR-CDR register from main memory. When garbage collection is
required, all evaluation processors stop interpreting EVLIS arguments
and perform this function in parallel. A simulation result on the
performance enhancement due to multiple processors was shown in the
paper [27].

Typical LISP machines have been surveyed. Table 1 shows a summary of
their major characteristics. In addition, there are other research
efforts related to LISP machines. ALPS/I (Aoyama List Processing
System/I) [28] is a compact, low-cost LISP machine on a universal
8-bit microprocessor (i8080). E. Goto, T. Ida et al., of the Institute
of Physical and Chemical Research, are designing a machine for
numerical, symbolic and associative computing, FLATS (Fortran and Lisp
machine with Associative features for Tuples and Sets) [29].

In FLATS, overflow-free and variable precision arithmetic, table
look-up computation, and associative computation are realized by
hashing hardware, a tag mechanism and hardware list processing.

Fig. 9 System Configuration of EVLIS Machine


Table 1 Architectural Comparison between LISP Machines


APL Interpreters

APL has many features that lend themselves to implementation by
firmware/hardware techniques, some of which are (1) dynamic data and
dimension attributes associated with variables, (2) various operators
to be applied to vector and array operands, and (3) a large number of
nonstandard operators. Moreover, because APL allows dynamic data
handling and because it is an interactive language, data type
checking, subscript checking and text editing are to be performed at
execution time.

In order to overcome the inefficiency of APL software interpreters due
to these features, several microprogrammed APL interpreters, similar
to Hassitt's machine at IBM, have been experimentally implemented on
microprogrammed computers in Japan since 1975. Various quantitative
evaluation data about firmware effectiveness in an APL interpreter
have been accumulated.

In 1975, an early experiment on a firmware APL computer [30] was made
by T. Motooka et al. at Tokyo University on an experimental machine,
PPS1 [43]. An APL source text is translated into an intermediate
language on a one for one basis by a lexical analyzer written in
microprogram. The intermediate language is composed of identifiers,
operators, constants and brackets. The order of elements in a
statement is the same in the internal representation. The interpreter
is written in microprograms and APL. Both the lexical analyzer and the
interpreter are implemented on the microprogrammed experimental
computer, PPS1. The authors concluded that the firmware APL computer
is much slower than an APL machine in software on scalar operations,
but faster on many vector operations.
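A one-for-one lexical translation of this kind can be sketched as follows. This is a minimal illustration, not the PPS1 microprogram: the token classes come from the text above, while the regular expression, the ASCII stand-ins for APL glyphs and the function name `lex` are assumptions made for the example.

```python
import re

# One alternative per token class named in the text. Any other single
# non-blank character is treated as an operator (ASCII stand-in for a glyph).
TOKEN = re.compile(
    r"\s*(?:(?P<constant>\d+(?:\.\d+)?)"
    r"|(?P<identifier>[A-Za-z]\w*)"
    r"|(?P<bracket>[()\[\]])"
    r"|(?P<operator>[^\sA-Za-z0-9()\[\]]))"
)

def lex(source):
    """Translate a source line into (class, text) elements, one per token.

    The elements are emitted in source order, so the internal form keeps
    the element order of the original statement.
    """
    source = source.rstrip()
    pos, elements = 0, []
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at position {pos}")
        kind = match.lastgroup
        elements.append((kind, match.group(kind)))
        pos = match.end()
    return elements

elements = lex("X[2]+3*Y")
```

Because no reordering or parsing is done at this stage, all structural work is left to the interpreter, which is consistent with the division of labour described above.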

H. Miyawaki et al. of Himeji Institute of Technology made a firmware
APL interpreter [31, 33], based on a quantitative analysis [32] of the
interpretation part, which is implemented in software on a general
purpose minicomputer, HITAC-10. An APL source statement is translated
into an intermediate text which is composed of 32-bit text elements
followed by an end element as shown in Fig. 10. It was indicated that,
in a firmwarization, appropriate functional modules that are
frequently used in implementing an APL interpreter are to be selected,
rather than all of an APL statement. As a result of this experiment,
it is shown that a firmware interpreter made of about 4.8-k words of
microprograms is 6 times faster than a software version.

Y. Horimoto from Toshiba Ltd. implemented a firmware APL interpreter,
the APL/EPOS I interpreter [34, 35], on an EPOS (Experimental
Polyprocessor System) system [44], whose component processor is
organized of a universal host microprocessor, PULCE (a high
performance universal computing element) [45], dedicated to emulation
with powerful microinstruction sets, various kinds of hardware
registers and so on.

APL source statements are translated into intermediate texts similar
to those of the preceding firmware APL interpreter by a translator
written in a pseudo APL language (PAPL), which is emulated with
microprograms. The intermediate texts, on the other hand, are
interpreted by PAPL and microprograms; the microprograms mainly
perform scanning of intermediate texts, the decision on the operation
category to be handled, and execution of basic APL operators.
According to evaluation data, the APL/EPOS I interpreter is 100 times
faster than a software version on some APL functions. It is also
faster than the execution of object code generated by a compiler.

Moreover, another similar research effort [36] has been carried out
on a dynamically microprogrammable computer, QA-1 [46], by
K. Kinoshita et al. of Kyoto University. Various unique experimental
results are expected because of many special QA-1 features, e.g.
hardware stacks, low-level parallel processing capabilities due to
the use of four ALUs, and tag manipulation functions.

PASCAL  Machine 

The use of a structured high-level language, PASCAL, is increasing
due to its high portability, programmer and execution efficiency, and
the compactness of its language processing system. At the same time,
in order to execute PASCAL programs effectively, PASCAL machines such
as the PASCAL Microengine of Western Digital Corp. have appeared.

T. Furuya of ETL experimentally implemented a Concurrent Pascal
Machine [37] on the multiprocessor system (ACE) [42], based on
P. B. Hansen's Concurrent Pascal Machine. An interpreter to execute
Concurrent Pascal Machine (CPM) instructions and a Kernel to supervise
parallel processes were made with both PDP-11/45 instructions and a
CPM oriented language (C-language), which were emulated with ACE
system microprograms. C-language consists of conventional machine
instructions like those of the PDP-11/45 and frequently used CPM
instructions. In order to execute multiple processes in parallel on a
multiprocessor system, process synchronization instructions and I/O
operations, having a process schedule function, are introduced to the
Kernel with the aid of an ACE synchronization module. As a result of
the experiment, various valuable evaluation data were shown, and a
great decrease in overhead time was attained by parallel execution of
processes and efficient process switching.

Other Research Efforts on High-level Language
Machine Design Problems


In addition to the high-level language machine implementation efforts
described earlier, a number of other research efforts related to
high-level language machine design problems have been made. The
intermediate language architecture of a high-level language machine
is one of the major keys to successful implementation. Some
evaluations [38, 39] on this problem were accomplished. Moreover, the
problem of a multilingual high-level language machine was considered.


Summary 

High-level language machines in Japan were surveyed. Generally
speaking, most of them are at the stage of fundamental and
experimental research completion. In the future, the appearance of
regular commercial high-level language machines and the confirmation
of their effectiveness will be desired.


ABSTRACT

The SYMBOL system is the prime example of the actual construction and
use of a high level language computer. It is unique in the
architecture, the instruction set, and the language. This paper
attempts to summarize some of the lessons learned from the machine
during the last eight years of its use. Comments are made on the high
level instruction set, and how the descriptor and tag mechanisms
affected the system. Several of the processors are discussed,
including the automatic memory management and the hardware
implemented operating system.

The difficulties encountered in debugging the hardware and the
software are compared.

Introduction

One of the most radical computer architectures of the last decade
was unveiled in 1971 with the announcement of the SYMBOL [1-3]
computer system. The prime goal of the SYMBOL research project was to
demonstrate with a full-scale working computer that a procedural
general-purpose programming language and a large portion of a
time-shared operating system could be implemented directly in
hardware, resulting in a marked improvement in computational rates.*
A further goal was to show that such a task could be mounted by a
relatively small group of people in a reasonable amount of time
through the use of appropriate design tools and construction
techniques. The announcement and initial papers on this computer
system were made at a time when it was not yet fully operational, and
was being moved to Iowa State University for final debugging,
evaluation and use. After arrival at ISU the computer was made fully
operational, and was used in a programming environment.

It would be nice if a definitive statement could be made neatly
categorizing all of the successes and failures of the project.
Unfortunately, such data was remarkably difficult to collect; project
members still disagree on many issues. Part of the problem in
evaluating SYMBOL was that the machine was radically different from
traditional computers in so many ways that a controlled comparison
was practically infeasible. Nevertheless, we feel it is important to
state our opinions; it should be understood that the following
comments are personal observations by the authors, based upon four
years of daily contact with the SYMBOL machine. In defense of the
original designers of the machine, we feel it necessary to reiterate
that SYMBOL was intended as a learning device, rather than as a
commercially viable product.

*Work done at Iowa State University under NSF grant GJ33097X.


Background 

The roots of SYMBOL go back as far as 1964, when it was decided by a
group of engineers at Fairchild's research facility in Palo Alto,
California that the future of integrated circuit technology dictated
the use of hardware for traditional software functions. The design of
the system was, and still is, a unique example of a completely top
down design. It was felt that existing programming languages had been
influenced too heavily by the underlying hardware, and that valuable
programmer time was unnecessarily being spent performing functions
such as memory management because of unreasonable computer
architectures. A high level language computer was seen as an answer
to reducing rising software costs.

One of the first tasks tackled was the specification of a new
programming language (SPL) [4, 5] along the lines of ALGOL 60 and
PL/I, but without underlying machine influences. The language was
designed for processing character oriented data that could be
variable in type, shape and size. Rigid type and size declarations
that would normally aid a compiler were omitted from the language as
they were seen to burden the user; conversions and space management
were handled automatically by SYMBOL's hardware. Structures of
arbitrary shape were to be explicitly representable in the language.
A top down design was derived from the language specification and the
desire to support multiple users in an interactive environment. Part
of the research effort was to probe the limits of hardware; even such
traditional software functions as the text editor were put in
hardware. The system was designed so that a user could walk up to a
cold computer, turn it on, and have all the functions necessary to
begin programming in a high level language using virtually no system
software. The resources needed to design this complex hardware were
substantial. A computer aided design system [6, 7] was developed to
check timing and loading, to do placement and wire routing, and to
maintain a system for documenting the circuitry of more than 20,000
packages.

At the time that the fabrication of SYMBOL was completed and
debugging began, the semiconductor industry was in a recession and a
managerial decision was made not to continue the project through a
second design that Iowa State University was to have received for
evaluation. Instead ISU obtained the original machine from Fairchild
in 1971, through a grant from the National Science Foundation, for
the purpose of bringing the machine to full operation so that the
unique ideas of the architecture could be more fully documented and
evaluated.  At  ISU  the  machine  was  brought  into  useful  operation  by 
I97.V  Work  on  the  system  software  and  hardware  was  done  by  a 
group of about six people, mainly graduate students. Funding for the
project terminated in 1978, and shortly afterwards hardware failures
forced the machine to be permanently decommissioned.


Experience with a High Level Instruction Set

The SYMBOL instruction set [8, 9] reflects the SYMBOL Programming
Language with almost a one-to-one correspondence between tokens in
the source and the object code. The hardwired Translator takes a
source program and generates an internal postfix representation to be
executed by the Central Processor. All operators are generic; the
types of operands are determined from the descriptors and type tags
associated with each identifier or constant. The instruction set is
aesthetically appealing in its simplicity. There are approximately
fifty instructions, only six of which require an address field. All
references to identifiers are made with an instruction that contains
the address of the identifier's descriptor. Constants may appear
in-line and are always tagged. The advantages of the instruction set
would appear to be its semantic conciseness and uniform mechanism for
referencing data.
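The combination of postfix code, generic operators and tagged operands can be illustrated with a small stack interpreter. This is a sketch under stated assumptions rather than SYMBOL's actual instruction set: the opcode names (`push_const`, `push_id`, `add`, `store`), the two tags and the conversion rule are all hypothetical, chosen only to show how one generic operator can select its behaviour from operand tags at run time.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tagged:
    tag: str        # "num" or "str"; stands in for a hardware type tag
    value: object

def to_number(operand):
    """Automatic conversion: a character string of digits becomes a number."""
    return operand.value if operand.tag == "num" else float(operand.value)

def generic_add(a, b):
    """A single '+' for all operand types; the tags decide what happens."""
    return Tagged("num", to_number(a) + to_number(b))

def run(code, descriptors):
    """Execute postfix code. `descriptors` maps a descriptor address to a value."""
    stack = []
    for opcode, arg in code:
        if opcode == "push_const":      # in-line constant, always tagged
            stack.append(arg)
        elif opcode == "push_id":       # arg is the address of a descriptor
            stack.append(descriptors[arg])
        elif opcode == "add":           # no address field needed
            right, left = stack.pop(), stack.pop()
            stack.append(generic_add(left, right))
        elif opcode == "store":
            descriptors[arg] = stack.pop()
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return descriptors

# Postfix form of "c = a + 2", where a currently holds the string "40".
descriptors = {"a": Tagged("str", "40")}
program = [("push_id", "a"), ("push_const", Tagged("num", 2)), ("add", None), ("store", "c")]
run(program, descriptors)
```

Note that only the identifier references and the store carry an address in this sketch; the operator itself does not, which mirrors the observation that few instructions require an address field.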

Code  compaction 

There are several problems with the high level nature of the
instruction set, only a few of which are specific to SYMBOL. The high
level and postfix stack orientation of the instruction set were
expected to give good code compaction. Closer examination, however,
revealed that SYMBOL's code was much less compact for typical
programs than on traditional machines such as the IBM 370 or PDP-11.
Several factors account for this poor code density. A substantial
fraction of the object code consisted of non-functional "end of
statement" operations, debugging links pointing to the source program
and No-Ops. Code density was also lost due to the fact that opcodes,
which are 1 byte in length, could be placed only in the first or
fifth bytes of the eight byte word, thus wasting three bytes for each
opcode that did not require an address field. The Translator
contributed to the problem by producing extremely poor code, at times
even replicating non-functional instructions. The strict one-to-one
correspondence between source and object code resulted in the absence
of many instructions that could have been useful in optimizing for
common special cases. Examples of such instructions would be
increment, set to zero, and append a character. The unusual memory
structure also hindered code compaction by prohibiting any address
calculations, thus precluding space saving using relative addressing
techniques. The lesson learned was that code compaction does not
necessarily result from high level instructions, and that factors of
two or three in code density can be lost without careful integration
of the instruction set, compiler technology and the memory structure.
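The cost of the opcode placement rule alone can be estimated with simple arithmetic. The sketch below assumes, beyond what the text states, that every instruction occupies one 4-byte half of the 8-byte word and that an address field fills the three bytes after the opcode; the instruction mix in the example is invented for illustration and is not measured SYMBOL data.

```python
SLOT_BYTES = 4      # assumed: one opcode slot is half of the eight byte word
OPCODE_BYTES = 1    # from the text: opcodes are 1 byte in length

def aligned_bytes(n_plain, n_addressed):
    """Bytes used when every opcode must start at byte 1 or byte 5 of a word."""
    return SLOT_BYTES * (n_plain + n_addressed)

def packed_bytes(n_plain, n_addressed):
    """Bytes used if opcodes without an address field could be packed tightly."""
    return OPCODE_BYTES * n_plain + SLOT_BYTES * n_addressed

def density_loss(n_plain, n_addressed):
    """Ratio of aligned size to packed size for a given instruction mix."""
    return aligned_bytes(n_plain, n_addressed) / packed_bytes(n_plain, n_addressed)

# Hypothetical mix: 60 instructions without an address field, 40 with one.
loss = density_loss(60, 40)   # 400 bytes versus 220 bytes
```

Under these assumptions the placement rule by itself costs a factor of about 1.8 for this mix, before counting No-Ops, end-of-statement operations and debugging links.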

High Level Instructions and Interrupt Handling

An unexpected lesson was that there are times when instructions can
be at too high a level. Because of the variable length operands and
high level operations, hundreds or even thousands of memory
references could be required to execute a single instruction. This
had rather severe consequences on interrupt handling (page fault,
disk servicing, user interrupt, process switch, etc.). Proper
interrupt handling requires the ability to stop execution, handle the
interrupt, and then resume execution of the original instruction at
the point of the interrupt. For efficiency reasons it is important to
be able to stop execution of an instruction (without completion),
save all state information active in the processing of the
instruction and resume execution at or near the point of interruption
rather than to restart execution of the instruction from the
beginning. For a high level algorithm, the state information that
must be saved can be rather large. A large fraction of SYMBOL's
design bugs were the result of the failure to save all the necessary
state information. This type of bug was extremely difficult to track
down, as the fatal interrupt was often generated non-deterministically
from combinations of disk interrupts, clock time-outs or users
pressing interrupt buttons. Another problem was the inability to save
all the necessary information for particular stages of the algorithm.
These oversights were eventually fixed, sometimes at the expense of
storing state information at "convenient checkpoints". Restarting at
such checkpoints repeated needless work after task shut-downs, and
worse, caused hundreds of times more state saves than were necessary;
this degraded system performance perhaps as much as 20%.
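The trade-off between resuming in place and rolling back to checkpoints can be made concrete with a toy model. This is an illustrative sketch only: a long instruction is modelled as a loop over `n_items` memory references, interrupts arrive at given positions, and the function name `run_instruction` and all numbers are assumptions rather than SYMBOL measurements.

```python
def run_instruction(n_items, interrupts, checkpoint_every=None):
    """Return (memory_refs, state_saves) for one long-running instruction.

    checkpoint_every=None: state is saved only when an interrupt arrives,
        and execution resumes exactly where it stopped.
    checkpoint_every=k: state is saved every k items whether or not an
        interrupt occurs; an interrupt rolls back to the last checkpoint.
    """
    pending = set(interrupts)
    refs = saves = 0
    position = last_saved = 0
    while position < n_items:
        at_checkpoint = checkpoint_every and position % checkpoint_every == 0
        if at_checkpoint and position != last_saved:
            saves += 1
            last_saved = position
        if position in pending:
            pending.discard(position)
            if checkpoint_every:
                position = last_saved     # work since the checkpoint is repeated
            else:
                saves += 1                # save full state, resume in place
            continue
        refs += 1
        position += 1
    return refs, saves

exact = run_instruction(1000, [250, 905])
checkpointed = run_instruction(1000, [250, 905], checkpoint_every=100)
```

In this model the checkpointed variant repeats 55 memory references and performs 9 state saves where 2 would have sufficed, which shows in miniature the two costs named above: repeated work and many unnecessary state saves.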

Optimization 

Code optimization in SYMBOL would be difficult to achieve because of
the generalized nature of the operations. The addition of lower level
instructions could have allowed optimization of many special cases.
For example, incrementing a variable on SYMBOL could take over a
dozen memory references due to its stack mechanism and indirection
through descriptors. The uniform referencing to data structures meant
that a compiler could not optimize accessing for special cases; in
particular, a tremendous performance penalty was paid within SYMBOL
because the memory structure made it impossible to perform
traditional indexing and address calculations. Even if such indexing
were possible, there would be an incompatibility because of the
inability to do binary arithmetic for addressing on the decimal only
machine.

Descriptors and Tags

Because SYMBOL was one of the few examples of a descriptor based
machine and a tagged architecture, a few comments are appropriate.
Operand and instruction tagging was useful in catching occasional
machine errors where, for a number of reasons, a memory reference
returned an incorrect value. There were never any instances where
data could possibly be mistaken for program or vice versa; this did
in fact report many machine errors that might have gone undetected in
a traditional machine. Tags were also of great benefit in debugging
and in developing sophisticated software debugging tools.
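How a tag check turns a silent error into a reported one can be shown with a minimal model of tagged memory. The two-valued `Tag`, the `TagError` exception and the `fetch` function are hypothetical names for this sketch; SYMBOL's real tag encoding is not described in the text and is not reproduced here.

```python
from enum import Enum

class Tag(Enum):
    INSTRUCTION = "instruction"
    DATA = "data"

class TagError(Exception):
    """Raised when a memory word does not carry the tag the processor expects."""

def fetch(memory, address, expected):
    """Read a tagged word and verify its tag before handing back the value."""
    tag, value = memory[address]
    if tag is not expected:
        raise TagError(f"address {address}: expected {expected.value}, found {tag.value}")
    return value

memory = {
    0: (Tag.INSTRUCTION, "add"),
    1: (Tag.DATA, 42),
}

opcode = fetch(memory, 0, Tag.INSTRUCTION)     # normal instruction fetch
try:
    fetch(memory, 1, Tag.INSTRUCTION)          # a wild fetch lands on data
    caught = False
except TagError:
    caught = True
```

On an untagged machine the second fetch would simply return 42 and attempt to execute it; here the mismatch is detected at the moment of the faulty reference.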

Descriptors had an even stronger impact on SYMBOL, both positive and
negative. Descriptors were invaluable in efficiently implementing the
dynamic typing present in the language and in the benefits provided
for debugging tools. On the other hand, implementing recursion in the
SYMBOL Programming Language was a task left to system software, and
turned out to be extremely inefficient. A simple test of Ackermann's
function would show SYMBOL to be at least three orders of magnitude
slower than traditional machines. The main problem was that the
descriptors for the entire procedure had to be copied upon a
recursive call if the descriptors themselves might be modified in the
call -- a virtual certainty in SYMBOL.

Need for a Systems Language

One of the problems with the SYMBOL language and instruction set was
that they were not efficient for lower level tasks common to systems
programming. The support tools on SYMBOL could have been more
effectively supported through a systems oriented language such as
BCPL [10], BLISS [11], or C [12]. While inefficiencies in short lived
user programs could be tolerated, the same cannot be said for system
software. The SYMBOL Programming Language turned out to be
inappropriate for systems programming. It is recommended that even on
computers that intend to support only one user language, a
significant effort should go into supporting an underlying systems
language. Addition of a few lower-level instructions could have made
SYMBOL an effective multi-language system.

System  Software  and  the  Hardwired  Operating  System 

The functions of a complete time-shared operating system were
implemented directly in hardware by the System Supervisor [13], aided
by the Memory Controller, Memory Reclaimer, Channel Controller, Drum
Controller, and Input/Output Processor. System software was intended
only to handle certain exceptional conditions, but in fact was used
to a much greater extent than the designers originally foresaw.
Substantial efforts of the research team were spent on developing
loaders, text editors, improved diagnostics, debugging packages,
library routines and a file system. This software was seen as
essential to make the system/user interface tolerable. System
software accounted for several thousand lines of code by the end of
the project. Much of the success of this software was due to the
foresight of the designers in providing "hooks" in the hardware for
software intervention, allowing the system to retain some flexibility
despite its hardwired implementation [14].

Two important questions are answered by SYMBOL concerning the
benefits derived from implementing major parts of an operating system
in hardware. First, it would seem that the overall design costs of
developing a hardware implemented operating system are much higher
than those of an equivalent software implementation; the desire to
lessen the cost of developing an operating system was not achieved.
Software costs were reduced, but overall costs were not. Traditional
software bug fixes were merely exchanged for a "Request for Hardware
Modification" sheet; the bound RFHMs were over four inches thick --
and accounted only for changes after the system was delivered
"debugged" to ISU! The second and more positive point is that the
implementation of the hardwired operating system seems to have been
very successful from a performance and programming standpoint. Though
the inflexibility of the hardware often prohibited changes towards
more "modern" operating system concepts, the implementation was very
successful in terms of the original design goals. Using hardware for
heavily used functions such as process scheduling, virtual memory
management, memory allocation, and scheduling of multiple processors
seems to have been a wise tradeoff. It was also shown that complex
hardware can be successfully interfaced to the software part of the
operating system. In terms of the overall design, SYMBOL deserves
recognition as a successful Operating System Machine as much as it
does for being a High Level Language Machine.

A Tale of Two Processors

While hardwired implementation of high level functions has its
merits, a look at two of SYMBOL's processors might prove insightful.
Perhaps the most striking aspect of SYMBOL to a user was the amazing
speed at which programs were compiled (70,000 to 100,000 statements
per minute). The SYMBOL Translator [15] is probably the only example
of a compiler implemented entirely with random logic. The Translator
is perhaps the most amazing of SYMBOL's processors, not only because
of its tremendous speed of compiling but also in that it worked at
all. One of the benefits of this tremendous translation speed was
that no object files were saved. This was an advantage in saving
storage space and in insuring that object programs always reflected
the current source program.

We do not wish to imply, however, that such speeds are generally
obtainable from a hardwired compiler and a high level instruction
set. The performance figures of SYMBOL's Translator are somewhat
misleading in that the speed came primarily from two other factors.
First, the SPL language [4, 5] had a grammar designed to be easy to
parse. Non-optimal code was generated in one pass with backpatching
and without the need for building compile-time data structures. The
high translation speed could not be expected in a proper
implementation of a compiler for SPL or more complex programming
languages. Second, the Translator did almost nothing more than crude
code generation or assembly. Error diagnostics were next to
non-existent, though in the majority of cases syntax errors in
programs were detected. Our experience suggests that compilers should
only be constructed using a high level programming language. Compiler
complexity can perhaps be attacked more successfully by using modern
compiler writing tools [16, 17] than by developing high level
instruction sets. The poor design of the Translator was undoubtedly
due in large part to the low-level implementation the designer was
forced to work with and the infantile state of compiler technology in
the early 1960's.

Debugging the Translator hardware was extremely difficult, as
register level flow charts and wire lists proved to be a totally
inadequate form of documenting the conceptual process of translation.
In no way could the design, implementation and debugging of SYMBOL's
Translator have been cost effective compared to a compiler programmed
in a high level language. The hardware dedicated to the Translator
was not cost effective, as the logic was rarely in use and a similar
function could have been performed by the Central Processor. Perhaps
a more reasonable tradeoff would have been to provide the Central
Processor with special purpose hardware to aid with the various
translation functions. This would have had the added benefit of
allowing special purpose hardware to be used for other functions in
addition to translation.

Even more than the Translator, the I/O Processor suffered from the
rigidity of a hardwired implementation. To offload the Central
Processor, the I/O Processor contained a hardwired text editor that
ran extremely quickly. Unfortunately the pushbutton operated editor
was so difficult to use and so primitive that all on-line editing was
done in the Central Processor with software text editors. The strict
separation of the I/O Processor and the Central Processor did not
allow the primitives in the hardwired text editor to be shared by the
software text editors.

Two lessons are evident. First, essential utilities of a system such as
a text editor and compiler need the ability to change and grow, both to
correct bugs and to add new features. The hardwired approach did not
allow the possibility for this growth. The functional division was at
too gross a level, e.g. the specialized hardware in the Translator
provided an all or none service. Second, special purpose hardware can
be made flexible by modularizing primitive operations so they can be
controlled by the software. If the sequencing of the primitives in the
I/O Processor had been controllable by software accessible by the
Central Processor, performance of the software editors might have been
much closer to that of the hardwired text editor. Much of the problem
of SYMBOL was that the designers thought they knew how users would want
to use the machine. When this view was changed even slightly, the
hardwired nature of the Translator, Editor and operating system bolted
the user into a mold he did not want to be in.

Memory Management: A Case of Strange Bedfellows

One of SYMBOL's unique features was its complex memory organization.
SYMBOL provided direct hardware support both for a paged virtual memory
and for dynamic data structures. The SYMBOL hardware supported the
allocation, deletion and manipulation of storage strings. These storage
strings were constructed by linking together eight-word groups. Linked
lists of such storage strings were used to represent tree structures
which were accessed in SPL as heterogeneous arrays. The sizes and
shapes of these structures were dynamically variable.

The designers of SYMBOL foresaw and attempted to mitigate the adverse
interaction of SYMBOL's unique combination of memory management and
virtual memory. They realized that particular machine functions had
characteristic memory access patterns. For example, the source code was
used in program editing but not at all during execution. In program
compilation, source code and object code were scanned only once,
whereas the name tables were scanned repeatedly. Hence, the designers
decided that each page should be used for a single purpose and that
page lists would be maintained to segregate the pages according to
their use. When memory was allocated, the usage class for the needed
space was specified by the hardware. This usage class determined which
page list the system would consult to find the needed space. SYMBOL
maintained three separate page lists: one for source code, another for
object code, and the third for all other needs. Once any space on a
page was allocated, the page was inserted on the appropriate page list.
Henceforth, that page would only be used for further allocations of
space of the same usage class. This scheme worked well for program
editing and for constructing name tables and object code at compile
time. However, at execution time, all data accessing involved one page
list, so there was no advantage to this scheme at that time.

It would have been worth while to experiment with adding more page
lists to SYMBOL: lists of pages used solely for the stack, for
temporaries, or for large structures. This likely would have limited
the scattering of these objects by restricting them to a segregated set
of pages. Unfortunately, implementation of additional page lists would
have required extensive modifications throughout SYMBOL's Central
Processor, and hence was never actually tried.

In SYMBOL, a single large structure could come to occupy small portions
of a large number of pages. There was no mechanism for compacting these
structures. Modifications to the memory allocation strategy attacked
the problem by preventing some of any reclaimed space on each page from
being found, except for expansion of structures which already occupied
a portion of that page. This was known as the Space Available List
(SAL) Threshold technique. Measurements taken on SYMBOL programs which
had had significant paging activity indicated that this approach
reduced the number of page faults dramatically. The greatest benefits
were realized when one sixth to one fourth of the page was reserved for
the above-mentioned expansion.

Experiments were performed reducing SYMBOL's page size from the
built-in 2K bytes per page to as low as 256 bytes per page. The use of
smaller pages usually reduced the paging activity for a fixed main
memory size. This technique worked whenever severe scattering was
encountered, regardless of its origin. Unfortunately, the use of small
pages could hurt where sequential access to a large body of code was
needed. Furthermore, the cost of the overhead associated with a large
number of pages could become significant. Although it would have
contradicted the declaration-free character of SPL, one cannot help but
speculate that the ability to request contiguous allocation of large
structures would have reduced paging considerably.
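
The segregation of pages by usage class and the SAL Threshold idea can
be illustrated with a small model. The sketch below is a simplified
illustration in Python, not SYMBOL's actual hardware algorithm: the page
size, the class and method names, and the 25 percent reserve are all
assumptions made for the example.

```python
PAGE_WORDS = 256  # hypothetical page size in words, chosen for illustration


class Page:
    def __init__(self, usage_class):
        self.usage_class = usage_class
        self.free = PAGE_WORDS
        self.owners = set()  # structures that already occupy part of this page


class Allocator:
    """Toy model: one page list per usage class, plus a per-page reserve."""

    def __init__(self, reserve_fraction=0.25):
        # Three lists, as in the text: source code, object code, everything else.
        self.page_lists = {"source": [], "object": [], "other": []}
        self.reserve = int(PAGE_WORDS * reserve_fraction)

    def allocate(self, usage_class, owner, words):
        if not 0 < words <= PAGE_WORDS:
            raise ValueError("request must fit on one page in this toy model")
        pages = self.page_lists[usage_class]
        # A structure expanding on a page it already occupies may use the reserve.
        for page in pages:
            if owner in page.owners and page.free >= words:
                return self._take(page, owner, words)
        # Other structures only see the space above the reserve threshold.
        for page in pages:
            if page.free - words >= self.reserve:
                return self._take(page, owner, words)
        # Otherwise start a fresh page on the list for this usage class.
        page = Page(usage_class)
        pages.append(page)
        return self._take(page, owner, words)

    @staticmethod
    def _take(page, owner, words):
        page.free -= words
        page.owners.add(owner)
        return page


alloc = Allocator(reserve_fraction=0.25)
first = alloc.allocate("other", "A", 180)   # A starts a page; 76 words remain
second = alloc.allocate("other", "B", 20)   # B is kept out of A's reserve
third = alloc.allocate("other", "A", 50)    # A expands into the reserved space
```

In this model structure A keeps growing on its original page while
structure B is sent to a new one, which is the scattering-limiting
effect the threshold was meant to achieve.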


Debugging Software on SYMBOL

An outstanding benefit from the high level nature of the SYMBOL
computer was shown in the efficacy of the debugging tools19,20 produced
for the system. Programs were developed to allow the user to examine
the state of his program in detail at the source program level. For
example, at a user-generated interrupt the programmer could ask the
inquire subsystem where the program was executing and have the
statement in execution decompiled for display. The decompilation
process was remarkably effective, and generally differed from the
original source program only with respect to spaces and redundant
parentheses. Since SYMBOL was a descriptor based and tagged
architecture, the current types and values of all identifiers in the
user's program were known.

There was never any need for a programmer to realize that his program
was being translated into an intermediate form for execution. This is
one of the strongest points for the claim that SYMBOL was a High Level
Language Computer System.21 In addition to the benefits that the
machine offered for debugging, the dynamic type checking mechanisms in
the hardware proved very valuable for detecting occasional machine
errors such as trying to use instructions as data or vice versa.
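
The value of tagged, self-describing storage for debugging can be
sketched in a few lines. The following Python fragment is only an
illustration of the general idea of dynamic tag checking; the tag names,
the Word class and the add and describe functions are hypothetical and
are not taken from the SYMBOL design.

```python
from dataclasses import dataclass
from enum import Enum


class Tag(Enum):
    INSTRUCTION = "instruction"
    INTEGER = "integer"
    STRING = "string"


@dataclass(frozen=True)
class Word:
    tag: Tag        # every stored item carries its own type tag
    value: object


class TagError(Exception):
    """Raised when an operation is applied to a word of the wrong kind."""


def add(a, b):
    # The check happens at execution time, on every use of the operands.
    for word in (a, b):
        if word.tag is not Tag.INTEGER:
            raise TagError(f"ADD expects integers, found {word.tag.value}")
    return Word(Tag.INTEGER, a.value + b.value)


def describe(name, word):
    # Because storage is self-describing, a debugger can report type and value.
    return f"{name}: {word.tag.value} = {word.value!r}"
```

Here an attempt to add a word tagged as an instruction is rejected at
once, and describe can report the current type and value of any
identifier without a separate symbol table for types.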

Debugging Hardware on SYMBOL

One of the questions the implementation of SYMBOL was supposed to
answer was whether or not extremely complex hardware could be designed
and debugged. The answer is that complex hardware can be designed and
debugged but only through the investment of tremendous effort and time.
In 1971 SYMBOL was debugged to the point where it could run simple
programs, yet in 197* bugs were still being found in various
processors. The situation appears to be no different from bugs that
plague software years after a program is developed, even if it is
continuously having bugs removed. The authors' experience with
debugging the SYMBOL system and more conventional software projects
would suggest that bugs in hardware occur in much the same way that
they do in software. However, the problems associated with finding and
curing hardware bugs are far more severe.

Changes to hardware are more time consuming than changes to software.
Modifications to SYMBOL had to be done with extreme care; changes often
had unexpected side effects because the conceptual details of an
algorithm were not documented as they might have been with well
commented software. It was not uncommon to cure the symptom rather than
cure the problem because of this lack of conceptual documentation.
Unlike software, certain changes could not be made because of physical
limitations such as the number of bus pins or the number of IC packages
that would fit on a board. Hardware errors and bugs were not always
deterministic. Because of this non-determinism it was first necessary
to ascertain whether a bug was due to an incorrect algorithm or if a
circuit was failing because of a bad component.

Any similar scale hardware project must make special efforts to provide
the maximum possible effort for developing design and debugging tools.
The state of the art in constructing and debugging digital systems is
far behind the same technology of software systems. This is probably
connected with the limited use of high level engineering systems such
as SCALD22 or DRAW.23 Computer aided debugging is a necessity. SYMBOL
needed the ability to trace and store the last several thousand
operations in real time and have the trace information analyzed
automatically. The limited trace facility on SYMBOL perturbed the
system sufficiently that some errors would go away when traced, and
when a problem could be traced reliably it was often beyond the ability
of a human to read through hundreds of lines of hex bit patterns to
find the offending error.
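
The kind of trace facility called for here, one that keeps only the
most recent operations and summarizes them automatically, can be
sketched as a ring buffer. This is a minimal software illustration in
Python; the TraceBuffer class, its method names and the default
capacity are assumptions for the example, not a description of any
SYMBOL hardware.

```python
from collections import Counter, deque


class TraceBuffer:
    """Keeps only the most recent operations, discarding the oldest."""

    def __init__(self, capacity=4096):
        # A deque with maxlen drops the oldest entry once it is full.
        self._events = deque(maxlen=capacity)

    def record(self, processor, operation):
        self._events.append((processor, operation))

    def last(self, n):
        """Return up to n of the most recent events, oldest first."""
        events = list(self._events)
        return events[-n:] if n > 0 else []

    def summary(self):
        """Count retained events per processor, a first step in automatic analysis."""
        return Counter(processor for processor, _ in self._events)


trace = TraceBuffer(capacity=3)
for step, processor in enumerate(["CP", "IO", "CP", "MC", "CP"]):
    trace.record(processor, f"op{step}")
```

With a capacity of three, only the last three of the five recorded
operations are retained, so the cost of tracing stays bounded however
long the system runs.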

Von Neumann Realities

SYMBOL is a classic example of a distinctly non-von Neumann
architecture. Features that take it out of the von Neumann class are
the non-contiguous memory structure, automatic memory management,
distinguishability of instructions from data, the self-describing
nature of structures, and the high level instruction set. An early
paper made the comment that

     as implemented in the SYMBOL hardware, however, any task
     requiring the variable field length processing and storage or the
     dynamic structure features of the language should show a
     considerable performance gain over conventional software/hardware
     systems.3

Experience with SYMBOL suggests that this is probably true, but
unfortunately there were not enough tasks of this type.

The reality was that programs on SYMBOL, as on most computers, tended
to do relatively simple operations. Arithmetic operations were mainly
adding or subtracting very small integers; little use was made of the
99 digit precision controlled arithmetic. Character strings were most
frequently only a single character, and rarely exceeded a dozen
characters in length. While some use was made of dynamically variable
arrays, arrays were almost always homogeneous and remained static once
grown. At the machine level, it hurt a great deal that the memory
structure and decimal arithmetic processor precluded indexing with
address arithmetic. Object code, name tables, and source files were
always static objects after their creation; a better storage
organization for these would perhaps have been a traditional contiguous
linear store. The moral of this story is that the traditional von
Neumann computer is perhaps not so ill-suited to the operations
actually performed by typical programs. The SPL language and SYMBOL
hardware were more powerful than the average user required. Some of
SYMBOL's more advanced features could have been implemented by software
on a traditional machine to achieve a more cost effective solution to
the same problems. Perhaps the conclusions would have been different in
another environment, but SYMBOL was not as much an advantage over the
von Neumann machine as had been hoped earlier.

Microcode

The hardwired nature of the SYMBOL machine is often criticized for its
inflexibility. Microcoding has been suggested as an implementation
solution that is flexible and still efficient. The understanding of the
authors is that during the 60's when technology decisions were being
made, ROM's suitable for microcode lacked speed, lacked density, and
were prohibitively expensive for the quantities required for SYMBOL. If
one were to design the same processors today, microcoding is obviously
superior to a random logic implementation. Part of the SYMBOL
experiment, however, was to push the limits of a completely hardwired
implementation; microcode would not have accomplished this. The
significant lessons to be learned from SYMBOL are not whether it should
have been microcoded or not, but rather in the lessons learned about
system complexity, refinement of complex systems, debugging of complex
systems, functional division, and instruction set design. In many
instances system software needs to be installation modifiable; a
microcode implementation would generally not fall into this category.

Was SYMBOL Really a HLLCS?

It is crucial to note why we consider SYMBOL to be one of the few real
High Level Language Computer Systems. The SYMBOL machine, with and only
with the software developed for it, meets the HLLCS definition21
because it:

(1) Uses a high level language for all programming, debugging and
other user/system interactions.

(2) Discovers and reports syntax and execution errors in terms of the
high level language source program.20

(3) Does not have any outward appearance of transformations from the
user programming language to any internal languages.

Perhaps the most crucial part of meeting this definition in any system
is being able to debug a program at the source language level. The
SYMBOL architecture facilitated this with high level instructions that
allowed object code to be easily de-compiled back into source, and in
the self-describing nature of all data objects that allowed the
unambiguous interpretation of any data storage. A High Level Language
Computer System is different from and more important than just a
machine with a high level instruction set.

Conclusion

The existence of the working SYMBOL computer system clearly
demonstrates that a high level instruction set, a compiler, automatic
memory management and a major portion of a time shared operating system
can be implemented successfully in hardware. Use of the SYMBOL system
showed to a lesser degree that the costs of building such a system are
not less than building an equivalent system in software; that the
ability to evolve a system is perhaps more important than having a very
fast functional unit that is never used; that performance gains from
hardwired implementation are easily lost.

SYMBOL taught us a great deal about building complex systems. The top
down design approach made it necessary for the entire system to be
conceived before any of it was implemented; the results show that this
is dangerous. Building complex hardware is prone to the same bugs and
fundamental design errors that plague complex software systems. SYMBOL
contained many excellent and unique solutions to individual problems
but the complex interactions of all of these solutions combined to make
the entire system cumbersome and slow. Refinement and iterative
improvement are steps that most software systems must go through before
reaching acceptable levels of performance and utility; this step was
desperately needed with SYMBOL. Performance could have been improved
perhaps more than an order of magnitude if many of the known
inefficiencies could have been tuned or removed. Despite several
negative comments in this paper, the SYMBOL experience was a very
positive first step in the design of High Level Language Computer
Systems.

ABSTRACT

This paper considers the principal motivations for a high-level
language architecture: Programmer Productivity, Compiler
Simplification, and Run-Time Efficiency. Individually and collectively,
these motivations do not represent compelling justification for a
departure from conventional architectures. It is suggested that a more
beneficial architectural departure is to be found in a lower-level
micro architecture instead of a higher-level architecture.


INTRODUCTION

The question of the desirability of a high-level language architecture
was asked at the birth of the stored program digital computer by Burks,
Goldstine, and von Neumann(1).

"In general, the inner economy of the arithmetic unit is determined by
a compromise between the desire for speed of operation -- a
non-elementary operation will generally take a long time to perform
since it is constituted of a series of orders given by the Control --
and the desire for simplicity or cheapness of the machine."

Over the years, architectural trade-offs have been made in favor of
selective incorporation of complex functions in those architectures
where performance was a dominant consideration. Floating point as an
elementary operation was provided as a hardware operation in the
mid-1950s(2). A variation of the FORTRAN DO loop was included in the
CDC STAR and TI ASC architectures in the 1970s(3). With vector
instructions included as elementary operations, the generation of
addresses is overlapped with the operation itself, yielding improved
performance, and a reduction in required memory bandwidth is achieved
by the reduction in the number of instruction fetches.

A view has been introduced into the discussion of elementary operation
selection. This view is an observation that a "semantic gap"(4) exists
between the programming language and the language which the computer
actually executes. The existence of a gap is an invitation to close the
gap.


A recurring idea is the high-level language architecture which directly
executes a selected language. SYMBOL(5,6) is this type of architecture,
as is the recently discussed Ada processor by Intel(7). For many
reasons, these architectures, labeled "Type C" by Myers(8), are deemed
inefficient. Most proposals today for a high-level architecture embrace
some intermediate language(9) as the language to be accepted by the
computer.

Proposals for high-level language architecture are based on achieving
three improvements:

1. Programmer Productivity
2. Compiler Simplification
3. Run-Time Efficiency

PROGRAMMER PRODUCTIVITY

Unfortunately, the observation has been made that closing the gap will
have a significant positive impact on programming cost. This has had
the result of drawing attention away from the real problem of selecting
elementary operations. I believe that this argument proceeds as
follows:

1. The best performance and the minimum code space result when a
problem is programmed in assembly language.

2. Poor performance and code space result if a high-level language is
used.

3. Programmer efficiency is improved if a high-level language is used.

Thus, a carefully selected intermediate execution language, which can
be compiler generated, will give good performance, reduced code space,
and increased programmer productivity.

Programming costs are a function of the language and the quality of the
support functions provided. It should make no difference in programmer
productivity whether the support functions are provided in hardware or
software.

Assistance in program debug is a benefit cited for a high-level
language architecture(10) which should reduce programming cost. I
believe that there is a lesson to be learned today from the support
systems provided for microprocessors. Program development is moving
into a cross support mode. More and more programs are developed on a
host which is not the computer on which the program will execute(11).
One reason for this is that powerful debug tools can be provided in the
development software. Only a very small subset of these tools could be
provided in the hardware of a high-level language architecture;
software support would still be needed. Relating execution errors
during development to the source program is enhanced more with software
tools than with a meager set of hardware capabilities.

COMPILER SIMPLIFICATION

A benefit frequently advanced for a high-level architecture is that a
well-selected intermediate level language significantly reduces the
complexity of the compiler. This is hard to understand. It can be
argued that these compound elementary operations of the intermediate
language can be defined as macro subroutines which the compiler can
easily produce. These macros can then be interpreted by the machine.
Again, this becomes a question of cost and performance. This "soft"
intermediate level language architecture yields all of the same
desirable compiler characteristics as does a "hard" architecture. The
Burroughs B1700(12) is an illustration of this point. Cohen and
Francis(13) describe another system which executes on conventional
microprocessors.

I will not argue that the specification and use of an intermediate
level language is not beneficial for compiler creation. I do argue that
this language, in total, should not be implemented in hardware. For
those cases where an intermediate language seems beneficial to the
compilation process, interpretation of this language is completely
feasible, although slow in execution. The benefits of reduced code
space, including the interpreter, generally are realized.

RUN-TIME EFFICIENCY

I perceive that the semantic gap has become highly visible because of
two factors. First, the non-computational overhead of structured
programming is increasing the run time of our programs, and second, the
execution of operating system functions is also consuming a highly
visible amount of CPU time. In both of these cases, the root problem
stems from the lack of a few elementary operations selected to support
these functions, not a closing of a semantic gap.

Myers(14) provides an interesting comparison of the concepts of PL/I
and the support provided by the S/370. I believe that in every case
cited by Myers, the issue resolved itself into the need for the
compiler to generate a body of code which implements the PL/I concept.
This is an issue of elementary operation selection and the cost
performance of the computer.

The cost performance of a computer having more complex elementary
operations is of real concern.


Let me examine the reduction in memory bandwidth resulting from the
inclusion of vector instructions. Myers(15) describes the case of two
100 by 100 element fixed binary arrays which are to be added together.
A programmed loop would require 40,004 memory references for
instructions and 30,003 for data, a total of 70,007. A single vector
instruction would require only 30,001 (30,000 for operands and one
instruction). An alternative to this is found in computers such as the
CDC 7600, which has a program buffer cache. This architecture requires
only eight references to main memory for the instructions and 30,000
references for the data. Vector instructions are not needed to reduce
memory bandwidth if instruction buffering and a high execution rate are
provided for the elementary operations.
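
The arithmetic behind this comparison can be checked with a short
calculation. The Python sketch below is illustrative only: the split
into four instruction fetches per element plus four setup fetches is an
assumption chosen so that the totals agree with the 30,003 data
references and the 70,007 total quoted above, and the function names
are hypothetical.

```python
ELEMENTS = 100 * 100  # each array holds 10,000 elements


def programmed_loop(n, instr_per_element=4, setup_instr=4, setup_data=3):
    """Instruction and data references for an explicit loop (assumed breakdown)."""
    instructions = instr_per_element * n + setup_instr
    data = 3 * n + setup_data  # two operand reads and one result write per element
    return instructions, data


def vector_instruction(n):
    """One instruction fetch, then two reads and one write per element."""
    return 1, 3 * n


def buffered_loop(n, loop_instr=8):
    """Loop held in an instruction buffer: its instructions are fetched once."""
    return loop_instr, 3 * n


for label, (instr, data) in [
    ("programmed loop", programmed_loop(ELEMENTS)),
    ("vector instruction", vector_instruction(ELEMENTS)),
    ("buffered loop", buffered_loop(ELEMENTS)),
]:
    print(f"{label:18s} instructions={instr:6d} data={data:6d} total={instr + data:6d}")
```

The buffered loop needs only seven more memory references than the
vector instruction, which is the point of the comparison: almost all of
the saving comes from not refetching instructions.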

The use of compound elementary operations can reduce the storage
requirements for instructions due to the instructions' higher
information content. In Myers' example, the number of instruction bytes
is reduced from 276 to 13. This is an impressive reduction! However, if
the program represents 20% of the total memory requirement, for
example, the compound elementary operations can yield, at best, a 20%
reduction in required total memory space. This small memory savings may
not be worth the increased cost of the CPU.
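
The bound on total memory savings follows from simple proportion. As a
minimal sketch in Python (the function name is hypothetical, and the
20 percent code share is the illustrative figure used in the text, not
a measurement):

```python
def total_memory_saving(code_fraction, bytes_before, bytes_after):
    """Fraction of total memory saved when only the code portion shrinks."""
    code_reduction = 1 - bytes_after / bytes_before
    return code_fraction * code_reduction


# Code is assumed to be 20% of memory; instruction bytes drop from 276 to 13.
saving = total_memory_saving(0.20, bytes_before=276, bytes_after=13)
print(f"total memory saved: {saving:.1%}")
```

Even with the code shrinking by more than 95 percent, the total saving
stays just under the 20 percent ceiling set by the code's share of
memory.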

Compound elementary operations to enhance run-time cost effectiveness
are provided at a cost in hardware, logic, and microcode. The
justification of this cost depends upon the number of times the
function is executed in a program; frequent use justifies, occasional
use does not. Figure 1 illustrates this point. The higher the cost of
providing a hardware macro, the larger the use factor must be to
achieve a breakeven cost.


Figure 1
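
The breakeven relationship can be expressed as a one-line calculation.
The sketch below, in Python, assumes a simple linear model in which
each use of the hardware macro saves a fixed amount; the function name,
the cost units and the sample numbers are hypothetical and are not
taken from Figure 1.

```python
import math


def breakeven_uses(hardware_cost, saving_per_use):
    """Smallest number of uses at which the accumulated saving covers the cost."""
    if saving_per_use <= 0:
        raise ValueError("the macro must save something on each use")
    return math.ceil(hardware_cost / saving_per_use)


# Doubling the cost of the hardware macro doubles the use factor required.
cheap = breakeven_uses(hardware_cost=1000, saving_per_use=2.5)
costly = breakeven_uses(hardware_cost=2000, saving_per_use=2.5)
```

Under this model a macro that is used less often than its breakeven
count never repays the hardware spent on it.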


Computer architects can quickly select most of the elementary
operations of their design. The inclusion of more complex or compound
elementary operations requires knowledge of the intended use of the
computer. Care must be exercised that static and dynamic statistics
collected on programs run on a unique computer reflect the true nature
of the problem and exclude the characteristics of the computer(16). For
example, code used for run-time checks will not be identified with the
higher purpose of the code. Nevertheless, choices are made and
computers are designed and built, which are improvements over prior
designs.

For a computer which must be multilingual, that is, can be programmed
in many languages, great care must be exercised in the selection of
compound elementary operations which will be useful for all the
languages. The result of implementing the intermediate language in
hardware can be a loss of generality. An intermediate language for
COBOL is not likely to be the same language for FORTRAN or PASCAL. And
what does one do when Ada becomes popular? Will the intermediate
language support the new programming language efficiently?

Figure 2 illustrates the problem which is created as the language
implemented by the hardware approaches the programming language,
closing the semantic gap. In a conventional processor, the high-level
language is compiled into machine language which is interpreted by the
hardware. As the machine language approaches the programming HLL, the
machine languages will diverge and become two or more different machine
languages if the semantic gaps are completely closed.

Kavipurapu and Cragon(17) are conducting a search for common elements
and their frequency of use in FORTRAN, COBOL, and PASCAL to see if
there are a few compound operations which will benefit all three
languages. I believe that there is a good chance that a small number
will be found that, if implemented in hardware, will substantially
improve a computer's code space and execution time. Success in finding
a few is not a mandate to implement everything in an intermediate
language.

A high-level or intermediate language implemented in hardware is too
restrictive and costly. However, selective implementation of a small
set of compound elementary operations can substantially improve the
performance of a computer. The question facing computer architects
today is not high-level language architectures, but architectures which
permit the inclusion of selected compound elementary operations which
match the use environment at any given time.

A writable control store with program access to sequences of microcode
is one technique. This will, in effect, provide for the interpretation
of the compound elementary operations by microcode. Substantial
improvement in program execution time can result(18,19,20). The
compiler should be able to make a selection of those compound
elementary operations which are interpreted by the machine's elementary
operations and those which are to be interpreted by microcode.
Production runs of a program can further adapt the mix to achieve the
fastest execution rate.

A second technique, and one which is attractive for implementation in
VLSI, is the use of compound function attached processors(21). A
floating point chip and an FFT butterfly chip which can be attached to
a microprocessor are examples. A Decimal String Chip would be useful
for a microprocessor executing a heavy COBOL load.

I will concede that there may be a place in computer architectures for
the inclusion of hardware employed to improve the reliability of
software in execution. The run-time environment creates problems which
cannot be anticipated by the compiler or require high checking
overhead. This issue should be addressed as a stand-alone issue and
should not be combined with the issue of a high-level language
architecture.

The ultimate architecture approach was suggested, I believe, by
McKeeman in 1967(22).

"The obvious attack for programmers and hardware people together is to
devise language that reflects what we want to do and how we do it (for
instance, in parallel) and machine structures effective in handling
that language. Let us call this method 'language directed computer
design.'"


In the future, the language referred to by
McKeeman must mean nonprocedural programming
techniques (23,25). The machine structures will
be microprogrammed in nature. The architecture
will be capable of either interpreting a "soft"
intermediate language or executing a compiled
microprogram. With memory becoming the least
costly component, compiled microcode will become
more and more cost effective. If a lower
performance is satisfactory, then the interpreted
soft intermediate language can reduce memory
cost. I believe that there is no "ideal DEL";
there may be a DEL for every nonprocedural
language, and this DEL can be interpreted on a
soft architecture if memory cost is to be
minimized.

CONCLUSIONS 

A case has not been made for the creation of
new architectures which implement high-level
or intermediate-level languages. All of the
benefits can be achieved without the loss of
generality by selective implementation of some
compound elementary operations in callable
microcode or attached processors. The ultimate
architecture will be a lower-level one, not, as many
advocate, a higher-level one.

ABSTRACT

In this paper, the design goals of
direct execution database computers are
stated. Using an existing database
management software system, the paper attempts
to show that the replacement of the software
system with a hardware database computer
may not obtain uniform performance gains
and storage savings. This discovery may
render the original design goals overly
ambitious.

On the other hand, the complicating
factors which hinder the gains and savings
may be attributable to the antique modes of
database management of conventional
software systems. To this end, the paper
attempts to isolate these factors and
identify the modes of operation for
consideration.


1. DESIGN GOALS

Normally, the effective use of a database
system by a user requires the user to be
familiarized with the languages of the database computer
system. There are essentially two such languages:
the database definition language (DDL) and the
database manipulation language (DML). DDL allows
the user (especially the database administrator
or database owner) to define the logical and
physical properties of the database. Logical
properties of a database are characterized by the
database models used. For example, in the relational
model [1], the logical properties of the database
consist of attributes and domains (of a tuple),
tuples (of a relation), primary keys (to the
tuples) and relations (of the database). In the
hierarchical model [2], the logical properties consist of
field names and values (of a segment), sequence
fields, primary and secondary indices (of
segments), segments (of a type), types (of a
parent-child relationship) and relationships (of the
database). Likewise, there are logical properties
of CODASYL databases [3]. By defining information
entities in terms of logical properties of a
database model, the user can capture the information
content in the database and make (symbolic)
references to the information entities.

DDL also allows the user (especially the
database designer) to define the physical
properties of the database. Physical properties of a
database are those which deal with units of
storage (say, number of pages and page size), kinds of
storage (e.g., moving-head disks vs. fixed-head
disks), storage formats of the logical entities
(directory format for indices, pointers for
related tuples or segments and encodings for
repeated attributes or field names) and access modes
(e.g., access by direct address calculation, via
intermediate records or by way of directories).

Because modern databases are meant to be
shared, the database system must provide
concurrent access and multi-user operations. DDL of a
modern database system must therefore provide a
means to allow the database owner (or
administrator) to authorize and validate certain users of
his database, define different portions of the
database for different users (e.g., by creating
different views of the same database), specify
the types of control operations permitted or
denied on the authorized portions, and place
procedures (e.g., programs written by the administrator
or owner) at the points of access paths to his
database (say, at each file opening time).

On the other hand, the database manipulation
language (DML) is primarily concerned with the
specification of search, retrieval, update, and
processing requirements of the database. Because
the use of data models enables the information
content to be captured in the database, the modern
DML enables the user to address the database by
content for search, retrieval, update and
processing operations. Content-addressing is
accomplished in DML as expressions of predicates. For
example, the following is a simple expression of
three predicates, namely, a conjunction of an
equality predicate, an inequality predicate and a
greater-than predicate.

(Type = EMPLOYEE) ∧ (Emp-Dept ≠ TOY) ∧ (Salary > 20,000)

which specifies those records of the
employees who are not in the toy department and
have salaries greater than 20,000. By referring to
specific attributes, providing the necessary
predicates, and specifying the intended operations
in DML, the user can manipulate the database
effectively at various granularities of the
database (i.e., at field or attribute-value pair
level, tuple or segment level, relation or segment
type level, and relationship level).
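
As an illustration only, the conjunction above can be sketched in Python. The record layout, the sample values and the helper names `predicate` and `conjunction` are assumptions made for this sketch; they are not part of any DML discussed in the paper.

```python
import operator

# Illustrative records stored as attribute-value pairs (hypothetical data).
records = [
    {"Type": "EMPLOYEE", "Emp-Dept": "TOY", "Salary": 25000},
    {"Type": "EMPLOYEE", "Emp-Dept": "SHOE", "Salary": 30000},
    {"Type": "EMPLOYEE", "Emp-Dept": "SHOE", "Salary": 15000},
    {"Type": "DEPARTMENT", "Emp-Dept": "SHOE", "Salary": 0},
]


def predicate(attribute, op, value):
    """Build a test on one attribute, e.g. Salary > 20000."""
    return lambda record: attribute in record and op(record[attribute], value)


def conjunction(*predicates):
    """A record qualifies only if every predicate holds."""
    return lambda record: all(p(record) for p in predicates)


# (Type = EMPLOYEE) AND (Emp-Dept != TOY) AND (Salary > 20,000)
query = conjunction(
    predicate("Type", operator.eq, "EMPLOYEE"),
    predicate("Emp-Dept", operator.ne, "TOY"),
    predicate("Salary", operator.gt, 20000),
)

selected = [record for record in records if query(record)]
print(selected)
```

The records are selected by what they contain rather than by where they are stored, which is the sense of content-addressing used here.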

The goals of high-level language database
machine designers are therefore to be able to come
up with high-performance and great-capacity
computer architectures which allow direct execution
of DDL and DML statements of the user application
programs. Direct execution of user programs
enables the performance and capacity gains of the new
machine to be contributed to the user in terms of
high-volume management and quick response which are
difficult to achieve in conventional software-laden
computers for very large database applications.

This difficulty is due largely to the fact that
conventional computers are not designed specially
for database management. Consequently, very
elaborate software for database management must be
supported on the computers. The execution of very
complex and sizable database management software
tends to deplete system resources and provides
inadequate responses to user applications.

Can we design direct execution database
computers? In other words, are there complications
in reaching our design goals?

2. ISSUES COMPLICATING DIRECT EXECUTION


There are at least two issues which have
complicated the design goals of direct execution
database computers. One issue is related to DDL;
the other is concerned with DML. These two issues
may render the direct execution of DDL and DML
statements for conventional database management
applications ineffective.

The most illustrative way to study these
complications is perhaps by focusing our attention on
a specific database model and a certain high-level
language database computer design. Here, we
focus on the hierarchical model [2]. We choose the
DDL and DML of IBM's Information Management System
(IMS) for study [4-7]. Presently, IMS is a widely
used hierarchical database management software
system. For database computer hardware designs, we
choose the database computer (DBC) which has been
proposed [8, 9] to support, among other database models,
the hierarchical model of databases.
However, many of the findings produced in the
following sections are valid for other models and
machines which, although not elaborated here, can be
found in [10, 11, 12, 13].


2.1 Direct Execution of DDL Statements for Creating
New Databases


Directly executable DDL statements for
hierarchical databases must be available so that, given
the logical properties of a hierarchical database,
the DDL statements, upon execution, can
automatically generate the physical structure of the
database for storage. Furthermore, the physical
structure generated must take full advantage of
the strong points and new capabilities of the
database computer.

Let us review briefly the logical properties
of an IMS database and present a (hardware)
transformation algorithm (as designed for DBC) which
converts the logical organization of an IMS
database into a physical structure for database
computer storage. We will also mention briefly some
strong points and new capabilities of the database
computer.


2.1.1 Logical Properties of an IMS Database.
An IMS database consists of
hierarchically related segment occurrences (or
simply, segments), each of which belongs to a
segment type. In the example of Figure 1, segment
type A, the root segment type, has three
occurrences. All others are dependent segment types,
each having a unique parent segment type and zero
or more child segment types. Some relationships
among the various segments in our example are:

A1 is the parent of B1 and C1.

H1, H2 and I1 are children of G1.

J1 and J2 are twins.

H1, H2, I1, J1 and J2 are descendants
or dependents of G1.

A1, G1 and I1 are ancestors of J1.

Successive levels are numbered such that a root
segment is at level 1. All segment occurrences
are made of one or more fields.

An IMS database is traversed in the order:
parent to child, front to back among twins and
left to right among children. The traversal
sequence for the database of Figure 1 is (A1, B1,
C1, D1, D2, D3, E1, F1, F2, F2, «, G1, H1, H2, I1,
J1, J2, A2, A3). Notice that the traversal
sequence defines a next segment with respect to a
given segment. A hierarchical path is a sequence
of segments, one per level, starting at the root,
e.g., (A1, G1, I1, J2).
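
The traversal order and the notion of a hierarchical path can be sketched as follows. The small hierarchy below is hypothetical: it borrows a few of the segment names listed above but is not a transcription of Figure 1, and the function names are assumptions made for this sketch.

```python
# Hypothetical hierarchy: each segment maps to its children, already listed
# left to right among child types and front to back among twins.
children = {
    "A1": ["B1", "C1", "G1"],
    "G1": ["H1", "H2", "I1"],
    "I1": ["J1", "J2"],
}
roots = ["A1", "A2", "A3"]  # occurrences of the root segment type


def traversal_sequence(segment):
    """Visit a segment, then each of its subtrees in order (parent to child)."""
    sequence = [segment]
    for child in children.get(segment, []):
        sequence.extend(traversal_sequence(child))
    return sequence


def hierarchical_path(segment, target):
    """Return the segments, one per level, leading from `segment` to `target`."""
    if segment == target:
        return [segment]
    for child in children.get(segment, []):
        path = hierarchical_path(child, target)
        if path:
            return [segment] + path
    return []


full_sequence = [s for root in roots for s in traversal_sequence(root)]
path_to_j2 = hierarchical_path("A1", "J2")
print(full_sequence)
print(path_to_j2)
```

The "next segment" of any segment is simply its successor in `full_sequence`, which is what a get-next call relies on.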

2.1.2 Automatic Generation of Storage
Structure. An IMS database with the above logical
properties can be defined in DDL statements which upon
execution transform the database into the proper
storage format of the database computer (i.e., DBC).
Because DBC does not address physical records by
locations, location-dependent pointers are not
used by DBC for the purpose of facilitating
hierarchically related records. Instead, physical
records are content-addressed by DBC provided that
the content of a physical record is presented as
one or more variable-length attribute-value pairs,
known as keywords. Thus, an IMS database is
transformed by considering every IMS segment as a
physical record (or, simply, record) composed of
keywords.

An IMS segment includes a sequence field
whenever it is necessary to indicate the order
among the twin segments. Since each segment
becomes a record and no address-dependent pointers
are allowed, the database computer assigns a
symbolic identifier to each segment, identifying it
uniquely from all other segments in the database.

Figure 1. Logical Organization of an IMS Database

The symbolic identifier of a segment F is a group
of fields consisting of:

(1) the symbolic identifier of the
    parent of F, and

(2) the sequence field of F.

Since the sequence fields of different segment
types may use the same field name, we must qualify
the field name with the segment type.

The creation of a record from an IMS segment
can now be accomplished by forming keywords as
follows:

(1) For each field in the segment,
    form a keyword using the field
    name as the attribute and field
    value as the value.

(2) Form a keyword of the form <TYPE,
    segtype> where TYPE is a literal
    and segtype is the segment type
    in consideration.

(3) For each sequence field in the
    symbolic identifier of the
    segment, form a keyword using the
    field name (qualified by the
    segment type) as the attribute
    and the field value as the value.

For example, for an IMS database shown in Figure
2, the attribute templates of the five types
of records corresponding to the five segment types
are shown in Figure 3. Qualified field names such
as Prereq.Course # are used to distinguish the
same field names, i.e., Course #, among different
segment types.

[Figures 2 and 3: an IMS database of five segment types and the
attribute templates of the corresponding records. Legible segment
fields: Offering (Date, Location, Format); Student (Emp #, Name,
Grade).]

2.1.3 Execution Gain vs. Storage Penalty.
Due to the simplicity of the transformation
algorithm, it is not surprising that DDL statements,
which enable the user to specify logically a
hierarchical database and provide automatic generation
of the database, can be readily realized in the
hardware and be executed directly to yield a
collection of physical records of keywords for
storage. Keywords enable the database computer (DBC)
to content-address records in the database that
contain the keywords. Thus, the hardware
realization of DDL statements indeed utilizes the
strong points and new capability of the database
computer.

However, if we examine the layouts of the
records, we note that the use of symbolic
identifiers to capture the parent-child relationships
may increase the storage requirement of the physical
records. Furthermore, as the levels of
hierarchy develop, the storage requirement of the
physical records may increase "exponentially".
This is evident by the following observation that
for each corresponding dependent segment the
physical record must include additional storage
space for:

(1) qualifications of the field
    names, and

(2) sequence fields of its ancestors.

For example, in Figure 3 a physical student
record must accommodate the qualified
name, Student.Emp #. In addition, the student
record also must include the sequence field (i.e.,
the date field) of its parent (i.e., a certain
offering record) as a keyword. Since the offering
is a child of a certain course record whose
sequence field is course number (i.e., Course #),
there is a keyword in the student record whose
attribute is Course #. The inclusion of course
number, date and qualifications in the student
record increases the storage requirement of the
hierarchical database considerably. On the other
hand, the inclusion of sequence fields as keywords
in records eliminates the need of pointer spaces
which were necessary in the IMS segments for the
purpose of linking all the twins of a given parent
sequentially. Despite such trade-off of spaces,
analysis has shown that the increase may be
[illegible] starting at level 4. Similar findings
in storage loss due to new database machine
requirement are obtained in relational as well as
CODASYL modelled databases.
It is not clear whether it is possible to devise a
hardware transformation algorithm which is as
simple as the one mentioned above and which can yield
storage gains. Until such an algorithm is found,
direct execution of DDL statements for database
creation in the new database computer environment
may actually cause a loss in storage.
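
The record-creation rules of Section 2.1.2 and the storage penalty discussed above can be sketched in Python. This is a minimal sketch under stated assumptions, not DBC's transformation algorithm: the `Segment` class, the sample field values, the choice to qualify every inherited sequence field (the paper qualifies names only where they clash), and the choice to store a segment's own sequence field once, in qualified form, are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    segtype: str                       # segment type, e.g. "Student"
    fields: dict                       # field name -> field value
    sequence_field: str                # name of the field that orders twins
    parent: Optional["Segment"] = None


def symbolic_identifier(segment):
    """Parent's symbolic identifier followed by this segment's sequence field."""
    inherited = symbolic_identifier(segment.parent) if segment.parent else []
    name = segment.sequence_field
    # Assumption: always qualify the field name with the segment type.
    return inherited + [(f"{segment.segtype}.{name}", segment.fields[name])]


def to_record(segment):
    """Turn one segment into a record, i.e. a list of (attribute, value) keywords."""
    # Rule (1): one keyword per field. Assumption: the segment's own sequence
    # field is emitted only once, in qualified form, under rule (3).
    keywords = [(n, v) for n, v in segment.fields.items()
                if n != segment.sequence_field]
    # Rule (2): the <TYPE, segtype> keyword.
    keywords.append(("TYPE", segment.segtype))
    # Rule (3): one keyword per sequence field in the symbolic identifier.
    keywords.extend(symbolic_identifier(segment))
    return keywords


# Hypothetical three-level path: Course -> Offering -> Student.
course = Segment("Course", {"Course #": "M23", "Title": "MATH"}, "Course #")
offering = Segment("Offering", {"Date": "800526", "Location": "CAMBRIDGE"},
                   "Date", course)
student = Segment("Student", {"Emp #": 50, "Name": "DOE", "Grade": "A"},
                  "Emp #", offering)

for segment in (course, offering, student):
    record = to_record(segment)
    print(segment.segtype, len(record), record)
```

In this sketch every record carries one extra keyword per ancestor, so records at deeper levels grow even when their own segments are small; no pointer fields are needed in exchange.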

2.2 Direct Execution of DML Statements
for Database Transformation

In IMS, the database manipulation language
(DML) statements known as DL/1 calls have the
following format:

Operation list

where the Operation is one of insert (ISRT), delete
(DLET), replace (REPL) and get (GET) calls, and
where the list is a number of segment search
predicates, at most one per level, which are used to
select a hierarchical path. Each segment search
predicate is preceded with the name of the segment
type. Let us denote the segment search predicate
at level i as Si.

After each retrieval or insertion operation, a
segment is "established" in the traversal sequence
of the IMS database. For a retrieval operation,
this segment refers to the segment just retrieved;
for an insertion operation, this segment refers to
the segment just inserted. Such a segment in the
traversal sequence is termed the current position in
the database. There are several forms of the get
call, each of which returns a single segment. A
get-unique (GU) call retrieves a specific segment
at level n by starting at the root segment type,
finding the first segment at each level i
satisfying Si, and finally retrieving the segment
satisfying Sn. A get-next (GN) call starts the search at
the current position in the database and proceeds
along the traversal sequence satisfying Si for all
i and retrieving the segment satisfying Sn.

We shall illustrate the manner in which
get-unique (GU) and get-next (GN) calls are executed
by the database computer. Referring back to the
IMS database of Figure 2, let us suppose that the
DL/1 call to be processed is:

GU Course (Title = 'MATH')
   Offering (Location = 'CAMBRIDGE')
   Student (Grade = 'A')

This asks for the first Student segment of the
database which satisfies the predicate Grade = 'A',
and which has a parent segment Offering with
Location = 'CAMBRIDGE' whose parent, in turn, is a
Course segment with Title = 'MATH'. The call is
executed as follows:

(1) Starting with the first segment
    search predicate, i.e., Title =
    'MATH', the Course segments which
    satisfy the predicate are
    retrieved by utilizing the query
    formulated by the machine

    ((Type = COURSE) ∧ (Title = MATH))

    and are sorted by the machine
    according to the value of their
    sequence field, i.e., by the
    attribute Course #.

(2) If no Course segment exists, then
    the DL/1 call is unsuccessful.
    Otherwise, the first Course
    segment is found and designated as the
    current Course segment.

(3) The Offering segments are then
    retrieved with the predicate Location =
    'CAMBRIDGE' and sorted by their
    sequence field, i.e., Date. If the
    sequence field of the current Course
    segment is (Course #, C), then the
    query used by the machine for this
    content-addressing is

    ((Type = OFFERING) ∧ (Course # = C)
    ∧ (Location = CAMBRIDGE)).

(4) If no Offering segment exists, then
    the current Course segment is
    removed and control is transferred to
    Step 2. Otherwise, the first
    Offering segment is designated as the
    current Offering segment.

(5) The Student segments are then
    retrieved with predicate Grade = 'A'
    and sorted by their sequence field,
    i.e., by Emp #. If the sequence
    field of the current Course
    segment is (Course #, C) and that of
    the current Offering segment is
    (Date, D), then the query used by the
    machine for this round of
    content-addressing is

    ((Type = STUDENT) ∧ (Course # = C)
    ∧ (Date = D) ∧ (Grade = A)).

(6) If no Student segment exists,
    then the current Offering
    segment is removed and control
    is transferred to Step 4.
    Otherwise, the first Student
    segment is designated as the
    current Student segment.

(7) The DL/1 call is successfully
    executed and the current
    Student segment is returned.
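
Steps (1) through (7) can be sketched as nested content-addressed queries with backtracking. This is a minimal sketch, assuming records are stored as dictionaries of keywords; `content_address` is a hypothetical stand-in for a machine query, and the sample data are invented for illustration.

```python
def content_address(records, conditions):
    """Stand-in for a machine query: keep records matching every keyword."""
    return [r for r in records
            if all(r.get(attr) == value for attr, value in conditions.items())]


def get_unique(records, title, location, grade):
    """Sketch of the GU call: first Student under the first qualifying path."""
    # Step 1: content-address the Course segments, sorted by sequence field.
    courses = sorted(
        content_address(records, {"Type": "COURSE", "Title": title}),
        key=lambda r: r["Course #"])
    for course in courses:                      # Step 2 (and return from Step 4)
        c = course["Course #"]
        # Step 3: Offering segments under the current Course.
        offerings = sorted(
            content_address(records, {"Type": "OFFERING", "Course #": c,
                                      "Location": location}),
            key=lambda r: r["Date"])
        for offering in offerings:              # Step 4 (and return from Step 6)
            d = offering["Date"]
            # Step 5: Student segments under the current Offering.
            students = sorted(
                content_address(records, {"Type": "STUDENT", "Course #": c,
                                          "Date": d, "Grade": grade}),
                key=lambda r: r["Emp #"])
            if students:                        # Steps 6 and 7
                return students[0]
    return None                                 # the call is unsuccessful


# Hypothetical records carrying their ancestors' sequence fields as keywords.
records = [
    {"Type": "COURSE", "Course #": "M1", "Title": "MATH"},
    {"Type": "COURSE", "Course #": "M2", "Title": "MATH"},
    {"Type": "OFFERING", "Course #": "M1", "Date": "790101", "Location": "CAMBRIDGE"},
    {"Type": "OFFERING", "Course #": "M2", "Date": "790201", "Location": "CAMBRIDGE"},
    {"Type": "STUDENT", "Course #": "M1", "Date": "790101", "Emp #": 7, "Grade": "B"},
    {"Type": "STUDENT", "Course #": "M2", "Date": "790201", "Emp #": 9, "Grade": "A"},
    {"Type": "STUDENT", "Course #": "M2", "Date": "790201", "Emp #": 3, "Grade": "A"},
]

found = get_unique(records, "MATH", "CAMBRIDGE", "A")
print(found)
```

With these data the first Course (M1) has no qualifying Student, so the search backtracks to M2, mirroring the transfers of control in Steps 4 and 6.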

It should be noted at this point that
the content of the work space of the machine
established by the above GU call may be used to
execute the next DL/1 call, for example, to retrieve
the next student who has an A grade in a math
course offered in Cambridge. This is depicted by
the following get-next (GN) call:

GN Course (Title = 'MATH')
   Offering (Location = 'CAMBRIDGE')
   Student (Grade = 'A')

In this case, the relevant segment may already be
present in the work space of the machine. The
current Student segment is removed and control is
transferred to Step 6 given for the GU call.

On the other hand, if the GN call is:

GN Course (Title = 'MATH')
   Offering (Location = 'CAMBRIDGE')
   Student (Grade = 'F')

then only existing Course and Offering segments
may be used. However, it is necessary that the
next Student segment returned should not precede
the current Student segment in the traversal
sequence. Hence, if the sequence field of the
current Student segment is (Emp #, E), that of the
current Offering segment is (Date, D), and that of
the current Course segment is (Course #, C), then
the following machine query is used for
content-addressing the next set of Student segments:

((Type = STUDENT) ∧ (Course # = C) ∧ (Date = D) ∧
(Emp # ≥ E) ∧ (Grade = F))

The previously existing Student segments are
removed and control is transferred to Step 6 given
for the GU call.

Finally, if the GN call is

GN Course (Title = 'HISTORY')
   Offering
   Student

then no currently existing segments are useful.
Hence, new sets of segments must be retrieved, one
set for each level.

2.2.1 [heading illegible]. From the above
discussion, it is not
surprising to learn that directly executable
database manipulation (DML) statements of the
following types of transactions will produce the "best"
performance for the database computer over the
conventional software-laden IMS system.

Transaction requirement:

(1) Find all segments satisfying
    given predicates.

(2) The predicate at the root level
    does not involve the sequence
    field.

(3) No predicate is given at any
    intermediate level.

Example: Find all those students who failed
a mathematics course regardless of the location at
which the course was offered.

     GU Course (Title = 'MATH')
        Offering
        Student (Grade = 'F')

Loop GN Course (Title = 'MATH')
        Offering
        Student (Grade = 'F')

     GO TO Loop

Let N be the number of root segments (i.e.,
courses). All of the root segments satisfying
the predicate are content-addressed. For each of
these root segments, all y of its third level
twins satisfying the predicate are then
content-addressed. We also assume that these third level
segments (i.e., those students who received grade
F) are scattered evenly. The relative performance
is charted in Figure 4. The entries of the chart
are computed as the ratio of page accesses (to IMS
segments in the old software-laden environment) to
block accesses (to physical records in the new
database computer environment).

Due to the very large content-addressable block
size (approximately 1/2 megabyte) and the relatively
small sequentially addressable page size (about 2
kbytes), this type of transaction may yield one or
two orders of magnitude of performance gain over
the conventional system.
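
The size ratio behind this estimate can be checked with a few lines of arithmetic. The sketch assumes binary units (1/2 megabyte = 512 x 1024 bytes, 2 kbytes = 2048 bytes) and considers sizes alone; the entries of Figure 4 also depend on how the qualifying segments are scattered, which is not modelled here.

```python
import math

BLOCK_BYTES = 512 * 1024   # content-addressable block, approximately 1/2 megabyte
PAGE_BYTES = 2 * 1024      # sequentially addressable page, about 2 kbytes

# Number of page accesses needed to cover the data reached by one block access.
pages_per_block = BLOCK_BYTES // PAGE_BYTES
orders_of_magnitude = math.log10(pages_per_block)

print(pages_per_block, round(orders_of_magnitude, 2))
```

Under these assumptions one block access covers as much data as 256 page accesses, which bounds the gain from above; evenly scattered segments that share pages would bring the observed ratio down toward the one-to-two orders of magnitude quoted.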

2.2.2 Where are the Performance Gains? Now
let us consider another type of transaction as
follows:

Transaction requirement:

(1) Find a single segment satisfying
    the given predicates.

(2) A predicate involving the
    sequence field is given at root
    level.

Example: Find the student with employee
         number 50, taking a CIS 211 course
         in Columbus. We note that
         course numbers are sequenced.

GU Course (Course # = 'CIS 211')
   Offering (Location = 'Columbus')
   Student (Emp # = 50)

The performance gains of this type of transaction
are charted in Figure 5. It is disappointing to
note that the performance of the database computer
for this type of transaction is not much better
than the conventional software-laden system.

2.2.3 Performance Gains vs. Transaction
Type. By comparing the examples presented in the
previous two sections, it is evident that the new
hardware of the database computer will not yield
significantly better performance over the software
system if the user transactions demand records in
a sequential manner and receive them one record
at a time. On the other hand, if for a user
transaction the demand is of high volume and the
search criteria of the demand
are made of predicates which require
content-addressing instead of sequential accessing, then
the strong points of the database computer
hardware can indeed yield high performance. Ideally
one would want to come up with a design of a
high-performance and great-capacity database computer
which can provide effective and efficient
solutions to either low-volume and sequential database
manipulation or high-volume and
content-addressable database manipulation. Such a design is
not in sight.

3. CONCLUDING REMARKS

Direct execution of existing high-level
database definition and manipulation language
constructs may not be desirable. The undesirability
is due to the lack of good database computer
design for uniform gains in storage requirement and
transaction execution. In other words,
special-purpose database computers may not be able to
bring about the high hope of anticipated
throughput gains which has been the design goal of the
database computers in the first place.

Nevertheless, database computers which are
capable of directly executing database definition
and manipulation language constructs will stay.
Their impact will be twofold. First, database
application programming will change. The change
will primarily be prompted by the advanced
features provided by the machines which are not
otherwise adequately available in conventional
software systems. For example, security and integrity
checks and concurrency controls can be more
effectively and efficiently introduced as hardware
mechanisms. The use of high-volume and
content-addressable search and update for very large
databases is another need for hardware realization.
These advanced features will allow existing
databases to migrate to a new database machine
environment with newly written application programs. On
the other hand, there is not much that the new
machine can improve for the old application
programs. However, with some interfacing software,
the existing application programs can still be run
on the new environment without the need of program
conversion. It is hoped that in the long run the
database application will be dominated by the newly
written application programs.

Secondly, the presence of the database
machines will have an important impact on the future
development of database definition and
manipulation languages. Despite their claim of data
independence (i.e., devoid of database software and
hardware implementation issues), the languages
were designed with certain known processing modes
and underlying technology of the time. As a new
technology with a high degree of parallelism and
content-addressability, the database computer will
require new database definition and manipulation
languages to be highly concurrent and associative.
Furthermore, the new languages should have an
integrated approach to the specification and control
of security and integrity checks of database access
and update. Thus, the study of database computer
design will also prompt our investigation of new
DDL and DML for the computers.

Abstract


An architecture of implemented hashing
hardware to be used in symbol manipulation is
presented. The major components of the hashing
hardware are a hash addressing unit and hash table
memories which can also be used as main memory of
the system. The hardware makes use of parallel
read-out and comparison mechanisms of logic-in-memory
banks. Basic hashing algorithms such as
search, insertion and deletion of keys are
realized by microprogram control. Performance
improvements ranging from 9 to 13 times are obtained
over pure software hashing. The application
techniques of hashing hardware to symbol table
manipulation, property list handling and set operations
are given. The advantages of hashing over
associative memories in these applications are also
discussed.

1.  Introduction 

Hashing plays an important role in speeding
up table look-up operations. It is extensively
used, not only in the traditional language translation,
i.e. assembling and compiling, but in
symbol manipulation at large, e.g. formula manipulation,
execution of a Lisp dialect, and
associative processing.

Although hashing is the fastest among known
methods in the table searching of N items in terms
of computational complexity (O(1) compared with
O(log N) of binary search, for example), a constant
time factor due to calculation of hash address
sequences is not small in software hashing and
in some cases, hashing gives way to alternative
techniques. Moreover, to avoid rapid degradation
of the performance, the table utilization must be
limited to far less than that of the total capacity,
say 70-80 %.

To overcome these difficulties, we proposed
parallel hashing schemes in which n independent
hash address sequences are used to access a hash
table organized as a b by P two-dimensional array
(b columns, to be called memory banks, are accessed
in parallel) (n ≤ b) (cf. Fig. 1), and presented
performance analyses. The results of
the analyses assured us that a successful search
takes on average fewer than 1.18 table look-ups
with n=b=4, or even 1.05 with n=b=32, until the
load factor of the table gets as high as 0.9.

Research supported in part by grants in aid from
Ministry of Education (No. 479039) and Kurata
Research Foundation.

Based on the analyses, we realised a parallel
hashing scheme on an experimental system, to be
used for symbol manipulation. In sections 2-5, we
discuss the architecture and the performance of
the implemented system.

The fact that basic hash table look-up operations
can be done with speed comparable to single
indirect addressing encourages more extensive use
of hashing in new areas of applications. In section 6,
we explain how several important algorithms
in symbol manipulation are speeded up by the
hashing hardware.


2.  Initial  Design  Considerations 


Our problem domain is symbol manipulation
where tables (data bases) to be searched are taken
in main memory and accessed by hashing algorithms
such as given in chapter 4 of Knuth.

Our approach is

(1) to build into memory-CPU interface parallel
mechanisms of (hash) addressing and data (key)
comparison,

(2) to incorporate hardware logic to compute hash
addresses into the address formation unit in
CPU,

and

(3) to replace the hashing control sequencing
(traditionally done by software) by faster
logic, i.e. microprogramming.


Several variations of hashing algorithms are
known with regards to key collision and deletion
handling, apart from the choice of hash functions.
We summarize below our considerations on the two
issues. For detailed discussion, see the cited papers.


Open addressing vs. chaining methods for collision
resolution

•  When bits required for chaining are rightly
taken into account, overall performances of
the two are nearly equal.

•  The open method is more amenable to
parallelism of memory accesses than
chaining.

Hence, the open addressing method is selected for
our implementation.

With or without key deletion

•  Traditional application of hashing such as
symbol table manipulation in language
translation may not require handling of key
deletion, since a symbol table is discarded
as a whole when compilation (or assembling)
is over.

•  However, in the advanced application to be
discussed in section 6,
key deletion handling is indispensable.

•  Among the key deletion algorithms based on
the open addressing method, an efficient
method developed in [7] requires extra
hardware resource in memory (collision number
counters in each memory word).

•  In our implementation, it is expensive to
incorporate extra bits in each word
without losing the compatibility with
the target computer architecture.

The above considerations lead us to adopt a key
deletion algorithm which makes use of three states
of a memory word, i.e. 'deleted' (all 1), 'empty'
(all 0) and 'occupied' (bit patterns other than
the above two bit patterns).
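
The three-state encoding can be illustrated with a minimal Python sketch. It assumes single-length 16 bit words; the names `EMPTY`, `DELETED`, `word_state` and `is_storable_key` are hypothetical and chosen only for this illustration. One consequence the sketch makes visible is that the two reserved bit patterns cannot themselves be stored as keys.

```python
WORD_BITS = 16                       # single-length word, as in the text
EMPTY = 0x0000                       # 'empty'   = all bits 0
DELETED = (1 << WORD_BITS) - 1       # 'deleted' = all bits 1

def word_state(word: int) -> str:
    """Classify a memory word into one of the three states."""
    if word == EMPTY:
        return "empty"
    if word == DELETED:
        return "deleted"
    return "occupied"

def is_storable_key(key: int) -> bool:
    """A key is storable only if it is not one of the two reserved patterns."""
    return 0 <= key <= DELETED and key not in (EMPTY, DELETED)
```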


the instruction repertoire of the processor is
augmented with the hashing instructions given in
Table 1.


HAU is further divided into three parts; hash
address generator (HAG), hash code generator (HCG)
and hash table descriptor unit (HTDU), as shown in
Fig. 3. HCG is used to generate, out of a key k,
bit patterns (hash code) which are then input to
HAG for the generation of a hash address sequence
(h_i). HAG implements the following generation
algorithm (cf. Fig. 3):

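Let $c$ and $c'$ be the hash code, and $P$ be the
size of a hash table (cf. Fig. 1). $P$ should be
a prime number. To generate $h_0$ and $\Delta h$, we use a
mask value $2^m - 1$ which satisfies the relation [...]

$$
\begin{aligned}
& h_0 \leftarrow c \wedge (2^m - 1), \qquad \Delta h \leftarrow c' \wedge (2^m - 1) \\
& \text{if } h_0 > P,\ h_0 \leftarrow h_0 - P \\
& \text{if } \Delta h > P,\ \Delta h \leftarrow \Delta h - P \\
& \text{if } \Delta h = 0,\ \Delta h \leftarrow 1 \\
& \text{for } i = 1, 2, \ldots, P-1: \\
& \qquad h_i \leftarrow h_{i-1} + \Delta h \\
& \qquad \text{if } h_i > P,\ h_i \leftarrow h_i - P
\end{aligned}
$$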

HM's are realized by logic-in-memory cards,
each having 32 k bytes of memory. They are interfaced
to common bus (Unibus) (hence accessed as
main memory via memory management unit (MMU)), and
have following functions;

•  parallel read operations of HM1-HM4 which are
invoked by HAU,

•  pattern matching capabilities, which detect
'deleted', 'empty' states, and key matches.


The difficulty with this algorithm is that the
'deleted' words accumulate after repetitions of
key deletions and insertions. It causes degradation
of the performance, especially of unsuccessful
searches. We need a clean-up operation of the
hash table; i.e. to reclaim 'deleted' words that
are no longer in collisions with other keys and to
turn them into 'empty' state, relocating keys, if
necessary. Without collision number counters,
this operation must be performed with the aid of
software (rehashing all the keys in the table) in
conjunction with garbage collection. The hardware
must have a function for monitoring the performance
in order to determine when to initiate the
garbage collection, however.
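
A software clean-up of this kind can be sketched in Python as a full rehash into a fresh table. This is only an illustration under simplifying assumptions: a single memory bank, a prime table size, and an illustrative double-hashing probe (`probe`) that is not the HAG algorithm of the hardware. The names `clean_up`, `probe`, `EMPTY` and `DELETED` are hypothetical.

```python
EMPTY, DELETED = 0x0000, 0xFFFF      # reserved bit patterns for 16 bit words

def probe(key, P):
    """Illustrative double-hashing address sequence (P prime, P >= 3)."""
    h, dh = key % P, 1 + key % (P - 1)
    for _ in range(P):
        yield h
        h = (h + dh) % P

def clean_up(table):
    """Rehash all occupied words into a fresh table, dropping 'deleted' marks."""
    P = len(table)
    fresh = [EMPTY] * P
    for word in table:
        if word in (EMPTY, DELETED):
            continue                  # nothing to relocate
        for h in probe(word, P):
            if fresh[h] == EMPTY:     # first free slot on the key's sequence
                fresh[h] = word
                break
    return fresh
```

After the rehash every remaining key sits on its own probe sequence with no 'deleted' words in front of it, so unsuccessful searches again stop at the first 'empty' word.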


3.  Description  of  the  Hashing  Hardware 


Figure 2 shows our experimental system incorporating
the hashing hardware unit (HU). It is
the implementation of the model in Fig. 1 with
n=1 and b=4 in the case of single-length (16 bit)
keys. The hashing hardware consists of two parts;
hash addressing unit (HAU) and hash table memories
(HM). The conventional ALU (16 bits) is microprogram
controlled. Without HAU, the system can
emulate an existing mini-computer (particularly
suited for PDP 11). With the hashing hardware,


Hash table descriptor unit (HTDU) in Fig. 3
contains 236 table descriptors and each provides
hash table base, size, and the other auxiliary
information to be used in HAG, microprogram control
unit and ALU. The descriptor of each hash
table can also be used to generate an 18 bit address
without the use of MMU.

The hashing control is realised by microprogram
and its algorithm is discussed in the next
section.


4.  Basic  Hashing  Algorithm 


Given key $k$, let $k_i$ be the simultaneously
read-out key from bank $i$, for $i = 1, 2, \ldots, b$.

We define following signals to be used in the
microprogram control unit;

$$
M = m_1 \vee m_2 \vee \cdots \vee m_b
$$

$$
E = \overline{M} \wedge (e_1 \vee e_2 \vee \cdots \vee e_b)
$$

$$
D = \overline{M} \wedge (d_1 \vee d_2 \vee \cdots \vee d_b)
$$

where

$e_i$ is the result of the comparison of $k_i$ and
'empty',

$d_i$ is the result of the comparison of $k_i$ and
'deleted',
and

$m_i$ is the result of the comparison of $k_i$ and $k$.
$e_i$, $d_i$ and $m_i$ are generated in memory bank HMi.

We should note that the comparisons are performed
in parallel and that the results (M, E and
D) are available immediately after the completion
of the key read operations.
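
The three control signals can be mimicked in a few lines of Python. The sketch assumes that E and D are masked by the absence of a match (an interpretation of the printed formulas), that the b words of one row are available as a list, and that 'empty' and 'deleted' are the all-0 and all-1 patterns of a 16 bit word. The name `bank_signals` is hypothetical.

```python
EMPTY, DELETED = 0x0000, 0xFFFF

def bank_signals(words, key):
    """Return (M, E, D) for the words read in parallel from the b banks."""
    m = [w == key for w in words]        # per-bank key match
    e = [w == EMPTY for w in words]      # per-bank 'empty' detection
    d = [w == DELETED for w in words]    # per-bank 'deleted' detection
    M = any(m)
    E = (not M) and any(e)               # assumption: E is suppressed by a match
    D = (not M) and any(d)               # assumption: D is suppressed by a match
    return M, E, D
```

In hardware the per-bank comparisons happen concurrently; the list comprehensions above only model the result, not the timing.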

Algorithm  S  (key  search) 


table. Therefore, the algorithm for HNI is only
to repeat the table look-ups until either E or D
becomes true.

Execution of the hashing instructions is interrupted
when the number of table look-ups exceeds
the pre-specified value (steps not shown in
the above algorithms). Counting the number of interrupts,
the hashing software can monitor the
performance of the table look-up operations of a
particular hash table; thus we can tell when to
invoke the clean-up operation as discussed in section 2.
Returning from the interrupt and restarting
the instruction is performed by instruction
HRTI. Instructions on 'virtual' keys are discussed
in section 6.


Instruction HSR is implemented by this algorithm.

Step 1. Set i ← 0.

Step 2. Compute a hash address h_i.

Step 3. Access the hash table.
        (M, E and D are available at the end of this
        step.)

Step 4. If M then return the matched position.
        If E then terminate the algorithm.
        (Key k does not exist in the table.)
        Otherwise, set i ← i+1, and goto Step 2.

The key deletion algorithm is similar to Algorithm
S: replace the first line of step 4 above with
"If M then put 'deleted' in the matched position".
Instruction HSD is used to execute the deletion
algorithm.
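
Algorithm S and its deletion variant can be sketched in Python as follows. The table is modelled as P rows of b words, one row per hash address, so that reading a row stands for the parallel read of the banks. The address generator `addresses` is an illustrative double-hashing sequence and an assumption of this sketch, not the HAG algorithm; the names `search`, `EMPTY` and `DELETED` are likewise hypothetical.

```python
EMPTY, DELETED = 0x0000, 0xFFFF

def addresses(key, P):
    """Illustrative address sequence h_0, h_1, ... (P prime, P >= 3)."""
    h, dh = key % P, 1 + key % (P - 1)
    for _ in range(P):
        yield h
        h = (h + dh) % P

def search(table, key, delete=False):
    """Algorithm S: return (address, bank) of key, or None if absent.

    With delete=True the matched word is overwritten by 'deleted',
    which corresponds to the deletion variant described in the text.
    """
    for h in addresses(key, len(table)):     # steps 1-2: i and h_i
        row = table[h]                       # step 3: parallel read of b banks
        if key in row:                       # step 4: signal M
            bank = row.index(key)
            if delete:
                row[bank] = DELETED
            return h, bank
        if EMPTY in row:                     # step 4: signal E, key is absent
            return None
    return None                              # whole sequence scanned
```

Note that a 'deleted' word does not stop the search: only an 'empty' word proves that the key was never placed further along the sequence.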

The key insertion algorithm which corresponds
to HSI is as follows:

Algorithm I (key search and insertion)

Step 1. Set i ← 0.

Step 2. Compute a hash address h_i.

Step 3. Access the hash table.

Step 4. If M then the algorithm terminates.
        (Key k already exists.)
        If E∧D then put k in the 'deleted'
        position, and terminate the algorithm.
        If E then put k in the 'empty' position
        and terminate the algorithm.
        If D then set t ← the 'deleted' position,
        set i ← i+1, and goto step 5.
        Otherwise, set i ← i+1, and goto step 2.

Step 5. Compute a hash address h_i.

Step 6. Access the hash table.

Step 7. If M then terminate the algorithm.
        (Key k already exists.)
        If E then put k in position t
        and terminate the algorithm.
        Set i ← i+1.
        Go to step 5.

Instruction HNI is used to insert a new key
that is known to be non-existent in the hash
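
The two phases of Algorithm I (searching before and after a reusable 'deleted' position t has been found) can be folded into one loop, as in the Python sketch below. The table layout (P rows of b words), the illustrative address generator `addresses`, and the behaviour when the whole sequence is exhausted are assumptions of this sketch; the hardware instead raises an interrupt after a pre-specified number of look-ups.

```python
EMPTY, DELETED = 0x0000, 0xFFFF

def addresses(key, P):
    """Illustrative address sequence h_0, h_1, ... (P prime, P >= 3)."""
    h, dh = key % P, 1 + key % (P - 1)
    for _ in range(P):
        yield h
        h = (h + dh) % P

def search_insert(table, key):
    """Algorithm I: insert key unless present. Return True if inserted."""
    t = None                                  # remembered 'deleted' position
    for h in addresses(key, len(table)):
        row = table[h]                        # parallel read of b banks
        if key in row:                        # M: key already exists
            return False
        if t is None and DELETED in row:      # D: remember first reusable word
            t = (h, row.index(DELETED))
        if EMPTY in row:                      # E: key is certainly absent
            if t is None:
                t = (h, row.index(EMPTY))
            table[t[0]][t[1]] = key
            return True
    if t is not None:                         # assumption: reuse t when no
        table[t[0]][t[1]] = key               # 'empty' word was ever seen
        return True
    return False                              # table full
```

The search must continue past a 'deleted' word because the key may still be stored further along its sequence; only then is it safe to reuse position t.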

Key  types 

The hardware has to cope with multiple-length
keys, since the keys are often strings of characters,
complex data structures, etc. The operation
of HU is not affected by the attribute of the
bit pattern (data type) other than the length.

The basic lengths are 'single' (16 bits),
'double', and 'quadruple'. Longer keys are treated
either as 'virtual' keys (cf. section 6) or as
lists. Hash tables are created to be one of the
above types, 'pair' (i.e. pair of a single length
key and the associated value) or 'virtual'. The
type information is put in the descriptor (obtained
from the descriptor) by instruction PTHT
(GTHT). This type information is used to invoke
appropriate micro code at the execution time of
HSR, HGV etc. Note that for 'double' keys, the
hash table appears as two-bank (b=2), and for
'quadruple' keys, as one-bank (b=1).

5.  Evaluation  of  the  Performance 

Figure 4 is the timing chart of HSI operating
on 'single' key. The actual clock periods for t0,
t1 and t2 in Fig. 4 are approximately 300, 400
and 1000 ns respectively, and therefore the estimated
execution time (excluding the fetch and
decode time) of HSR in the case of successful
search is 1.6+1.3i micro sec, where i is the number
of hash table accesses. i depends upon the load
factor of the table and the number of memory
banks. The values of i based on the theoretical
analysis are given in the cited references (e.g.
reference 4). In the
parallel hashing schemes, i is equal to 1 mostly,
unless the hash table is heavily loaded.

Table 2 shows the timing of typical runs
which make use of HSR. We can observe the
performance enhancement by a factor of ten over
the software hashing. Similar improvements of the
performance are observed in the case of the other
hash instructions.
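
As a small worked example, the estimate can be evaluated for a few values of i. The sketch reads the formula as 1.6 + 1.3 i microseconds (excluding fetch and decode); the function name `hsr_time_us` is hypothetical, and the value i = 1.18 is the average look-up count quoted in the introduction for four banks at a load factor of 0.9.

```python
def hsr_time_us(i: float) -> float:
    """Estimated HSR execution time in microseconds for i table accesses."""
    return 1.6 + 1.3 * i

for accesses in (1.0, 1.18, 2.0):
    print(f"i = {accesses:4.2f}  ->  {hsr_time_us(accesses):.3f} micro sec")
```

For i = 1 the estimate is 2.9 micro sec, and for i = 1.18 it is about 3.13 micro sec, so the cost of the occasional extra look-up stays small compared with the fixed part of the instruction.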


6. Application of the Hashing Hardware

Although the hashing hardware is designed to
be as general as possible, in this paper we
only give the following applications. This is because
these are used in existing software systems and
the effectiveness of use of hashing is already established.
The hardware replacement of the hashing
software algorithm will greatly speed up the
operations as observed in section 5.

(1) symbol table manipulation in assemblers and
compilers,

(2) property list handling, [8]

(3) creation of a unique copy of data structures
to enable fast equality checking, [2]

(4) as a special case of (3), hash 'cons' in Lisp
for the sharing of sub-data structures and
fast equality checking, [2]

(5) set operations. [9]

Symbol  table  manipulation 

Figure  5  illustrates  data  structures  of  the 
symbol  tables  to  be  used  in  conjunction  with  HU. 

In Fig. 5, HT1 is the 'pair' type hash table.

When the key is 16 bit, the key itself is put in
the key part of the hash table. Longer keys are
accommodated as a pointer to some appropriate entry
of another hash table (e.g. when a key is
'double', a pointer to an entry of HT2 is placed
in HT1.)


Property  list  handling 

A property list is a Lisp terminology. [10]
An implementation method as given in reference [10]
relies on sequential search of lists. The
method discussed here is a speed-up version of
property list handling using hashing. For example,
the Lisp code (GET OBJECT ATTRIBUTE) may be
executed (interpreted) as

    HSR t1,a      ; a points to a double-word key
                  ; consisting of pointers to
                  ; atoms OBJECT and ATTRIBUTE,
                  ; and t1 denotes a hash table
                  ; number.
                  ; This instruction searches for
                  ; a Lisp cell constructed by
                  ; hashed cons(OBJECT, ATTRIBUTE)
    BNE UNSUC     ; If not in the hash table,
                  ; unsuccessful search
                  ; (result in r)
    MOV r,a
    HGV t2,a      ; t2 is the 'pair' type
                  ; hash table, where the value
                  ; associated with
                  ; (OBJECT ATTRIBUTE) is stored.
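
The two-table scheme of the instruction sequence above can be imitated in Python with two dictionaries. This is only an analogy under stated assumptions: Python's built-in `dict` stands in for the hashing hardware, the first dictionary plays the role of hash table t1 (mapping the pair of atoms to a unique cell), and the second plays the role of the 'pair' type table t2. The class name `PropertyStore` and its methods are hypothetical.

```python
class PropertyStore:
    """Property lookup by hashing instead of sequential list search."""

    def __init__(self):
        self._cells = {}     # (object, attribute) -> cell number  (role of t1)
        self._values = {}    # cell number -> value                (role of t2)

    def put(self, obj, attr, value):
        # Create the unique cell for (obj, attr) on first use, then store.
        cell = self._cells.setdefault((obj, attr), len(self._cells))
        self._values[cell] = value

    def get(self, obj, attr, default=None):
        cell = self._cells.get((obj, attr))      # corresponds to HSR t1,a
        if cell is None:                         # corresponds to BNE UNSUC
            return default
        return self._values[cell]                # corresponds to HGV t2,a
```

A lookup then costs two hash probes regardless of how many properties an object carries, whereas a sequential property list grows linearly with the number of attributes.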


Creation  of  unique  copy  of  complex  structures 


be formatted so that HU can handle it. One way to
handle the complex structure is to make an abbreviated
key (p.343 in Knuth [6]) or
v(virtual)-key out of it. How to make the
v-key is in the realm of software. To treat a
v-key as a proper hash key is that of hardware.

In treating a v-key, we should note that:

•  creation of a v-key out of a complex structure
is many-to-one mapping,

•  hence, HU has to cope with the situation of
multiple key matches.

The search algorithm on a v-key differs from
Algorithm S in the following points:

1. When a v-key match occurs, it saves
the current hash status (e_i, d_i, m_i, h_i, Δh),
and returns the pointer to
r-key*1 (performed by instruction HGR).

2. The associated software checks whether r-keys
match.

3. If r-key match occurs, the search ends
successfully.
Otherwise, the search restarts from the next
point where it is suspended after restoring
the hash status (performed by instruction
HGRN).

4. When E*2 becomes true, the search terminates
unsuccessfully.

As a special case, we consider the case that
the key itself is again a pointer to a hash table.
This is the case where a set is implemented.

Figure 6 shows the data structure. The search
algorithm is as follows:

1. Compute the v-key using a symmetric hash
function g,
i.e. g(x,y)=g(y,x), since the order of
elements of a set is insignificant.

2. Use HGR and find the v-key match.

3. If E then terminate the algorithm
(unsuccessful search).

4. Use HSR to test the matches of each element of
the hash tables.

5. If all the elements match, terminate the
algorithm, otherwise find the v-key match by
HGRN and goto 3.
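
The set search can be sketched in Python under loudly stated assumptions: XOR of element hashes is used as one possible symmetric function g (the text does not say which g is used), a dictionary of buckets stands in for the hash table of v-keys, and the names `v_key` and `SetTable` are hypothetical. Walking a bucket corresponds to the HGR / HGRN sequence, and the element-wise comparison corresponds to the HSR tests.

```python
from functools import reduce

def v_key(elements, bits=16):
    """Symmetric (order-independent) hash: XOR of truncated element hashes."""
    mask = (1 << bits) - 1
    return reduce(lambda acc, x: acc ^ (hash(x) & mask), elements, 0)

class SetTable:
    """Keeps one unique copy of every set, found through its v-key."""

    def __init__(self):
        self._by_vkey = {}                   # v-key -> list of stored sets

    def intern(self, elements):
        s = frozenset(elements)
        bucket = self._by_vkey.setdefault(v_key(s), [])
        for candidate in bucket:             # each v-key match in turn
            if candidate == s:               # check all elements (r-key match)
                return candidate
        bucket.append(s)                     # no match: store a new unique copy
        return s
```

Because the mapping from sets to v-keys is many-to-one, different sets may land in the same bucket (for small integers, {1, 2} and {3} both give v-key 3 here), which is exactly the multiple-match situation the r-key check resolves.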


*1 When necessary, we use term 'r(real)-key' to
denote the key other than v-keys to clarify the
difference.

*2 Strictly speaking, E is not the same as that
defined in section 4, since the scan of signals
(e_i, d_i, m_i) may start from the bank different from
1, and since multiple match may occur.


In general, complex structures cannot be treated
directly by HU, unless it is built up of uniform
structures such as lists in Lisp. It should


We should note that since HU is used recursively,
we need to save the contents of the temporary
storage in HU (i.e. Δh, h_i), besides status
e_i, d_i and m_i in the v-key processing (execution of
HGR and HGRN). Hence, we have duplicate of registers
in HU actually; ones for v-key hashing and
the others for r-key hashing.

In the case of lists, we can do without
v-keys. HT3 in Fig. 5 illustrates the shared
linked list constructed by unique 'cons' by
hashing.
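
The idea of a unique 'cons' by hashing can be illustrated with a short Python sketch. It assumes that atoms are hashable values compared by value and that cells are compared by identity, which is what makes an equality test on two structures a single pointer comparison. The class names `Cell` and `HashCons` and the helper `make_list` are hypothetical.

```python
class Cell:
    """A Lisp-like cons cell; compared and hashed by identity."""
    __slots__ = ("car", "cdr")

    def __init__(self, car, cdr):
        self.car, self.cdr = car, cdr

class HashCons:
    """Creates at most one cell for every (car, cdr) combination."""

    def __init__(self):
        self._table = {}                     # (car, cdr) -> unique Cell

    def cons(self, car, cdr):
        key = (car, cdr)                     # sub-cells are already unique
        cell = self._table.get(key)
        if cell is None:
            cell = self._table[key] = Cell(car, cdr)
        return cell

    def make_list(self, *items):
        result = None                        # None plays the role of NIL
        for item in reversed(items):
            result = self.cons(item, result)
        return result
```

Two lists built from the same elements are then the same object, and lists with a common tail share that tail physically, which is the sharing of sub-data structures mentioned in the text.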

7. Comparison with Alternative Techniques

To summarize the applications discussed, we
see that hashing is used essentially in three
ways; (1) associative retrieval and (2) construction
of a unique copy of a data structure for fast
equality checking, and (3) as a consequence of (2),
sharing the sub-data structures in constructing
complex structures.

Associative retrieval by hashing is based
upon the single-hit property of keys. This operation
could be performed by associative memories
such as surveyed by Yau and Fung. [13] However,
we consider hashing more advantageous in our problem
domain for the following reasons:

•  Hashing is based upon conventional RAMs
(Random Access Memory chips),
which are simpler in structures at gate level
by at least one order of magnitude
than associative memory chips e.g. Intel SIM,
not to mention the cost performance.

•  Furthermore with the same level of
semiconductor technology RAMs are faster
than associative memories;
hence in many applications hashing is faster
than associative processing based on
associative memories.

•  Larger scale implementation is possible with
our hashing scheme; the size of the table is
limited only by address space of the main
memory.

•  Full capability of host CPU can be utilized
in conjunction with hash table manipulation
with no additional hardware cost, since
hash tables are realized in main memory.

•  Hence the capability of associative retrieval
is easily incorporated into existing
architecture as shown in previous sections.

•  Variety of data structures can be used in
hashing, since they are realised in
RAMs (main memory)
whereas in associative memories data
structures would be subjected to hardware
memory word configuration.

As for the second and third usages of hashing,
corresponding efficient algorithms (e.g. set
operation) based on associative memories
would be difficult to develop. Different
approaches to these applications would be
necessary.

8. Concluding Remarks

We have shown how hashing can be implemented
by hardware and given some illustrative examples
of its use.

The architecture shown in Fig. 2 reflects
the basic requirements for the hashing hardware as
given in [7]. It also reflects the design compromise
imposed by practical considerations for
the experimental system, such as cost performance,
compatibility with the existing system, dimension
of the system, etc. We briefly discuss the
alternatives we could have taken if some of the
above limitations were removed.

Let us take the execution of HSR operating on
a 'single' key (without key deletion), for example.
The average execution time is divided into 41%,
12% and 47% when the load factor is 0.9, for
memory accesses, key alignment, and other micro
operation, respectively. This ratio indicates
that the hashing operation is memory limited.
If faster memories or caches are available, the
speed of hashing will be further improved.

The generation algorithm of hash address
sequences given in HAG suffers from non-uniformity
of h_0 and Δh, when the size of the table, P, is
not close to the power of 2. The trade-off
between speed and the uniformity of the distribution
of initial hash addresses is discussed
elsewhere. [12]

Examining the figures in Table 2, we can conclude
that the choice of n=1 (one HAG) and b=4
(four memory banks) seems to be adequate (see the
reference for further discussion). However,
with additional hardware, we would have chosen
parameters b=8 (slight increase of the performance
will result), or b=4 with each memory card
equipped with 64 KB. Then all the memories could
be usable for hashing.

Software which makes extensive use of the
hashing hardware is not yet completed. Full evaluation
of the hardware has to await the
software development. The experience with the design
and construction of the hashing hardware will
be used to build a larger system for symbolic
algebra. [13] We hope that the instruction repertoire
will provide data to standardise the hashing
operation both in hardware and software. We also
hope that in high level language machines hashing
hardware will be incorporated as an integrated
unit since hashing is believed to speed up
essential search operations in interpreter-based
systems such as Lisp and a direct execution
machine for high level languages. [11]
Instruction   Function

HSR           Search key
HGV           Get value of 'pair'
HPV           Put value in 'pair'
HNI           New key insert
HSI           Search and insert
HSD           Search and delete
HGR           Get real-key
HGRN          Get real-key next
HPR           Put real-key
HDX           Delete existing virtual-key
HRTI          Return from hash interrupt
PTHT          Put in hash table descriptor
GTHT          Get from hash table descriptor

Table 1 List of Hashing Instructions


                            case 1H    case 2H    case 1S     case 2S

HSR for 'single' keys       6.1        6.6        5.5x10      8.3x10
HSR for 'double' keys       1.1x10     1.2x10     1.2x10^2    1.7x10^2
HSR for 'quadruple' keys    1.8x10     2.0x10     2.0x10^2    2.3x10^2

(in micro sec)


Note:

1. Values are average execution timings
   when accessing all the keys that are

   1H: filled up to 50% of the table that is
       initially 'empty',

   2H: filled up to 80% of the table that is
       initially 'empty'.

   Cases 1S and 2S are those obtained by executing
   equivalent pure software (using standard PDP11
   instructions) hashing algorithms on the same
   machine.

2. Timings include fetch and decode time and
   interrupt handling time if interrupt occurs.


Table 2 Average Execution Timings of HSR
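
The speed-up implied by Table 2 can be worked out directly. The sketch below assumes that the table entries are read correctly as microsecond values (for example 5.5x10 as 55) and that cases 1H/1S and 2H/2S pair up as hardware versus software runs at the same load; the names `timings_us` and `speedups` are hypothetical.

```python
# Average HSR timings in microseconds, as read from Table 2 (an assumption).
timings_us = {
    "single":    {"1H": 6.1,   "2H": 6.6,   "1S": 5.5e1, "2S": 8.3e1},
    "double":    {"1H": 1.1e1, "2H": 1.2e1, "1S": 1.2e2, "2S": 1.7e2},
    "quadruple": {"1H": 1.8e1, "2H": 2.0e1, "1S": 2.0e2, "2S": 2.3e2},
}

def speedups(row):
    """Return (software/hardware ratio for case 1, ratio for case 2)."""
    return row["1S"] / row["1H"], row["2S"] / row["2H"]

for key_type, row in timings_us.items():
    s1, s2 = speedups(row)
    print(f"{key_type:9s}  case 1: {s1:4.1f}x   case 2: {s2:4.1f}x")
```

For 'single' keys the ratios come out at roughly 9 and 12.6, in line with the improvement of 9 to 13 times quoted in the abstract.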
Figure 2 System with Hashing Hardware


Figure  3  Block  Diagram  of  Hash  Addressing  Unit 


[Figure 4: timing chart. Columns: timing, unit (HTDU, HCG, HAG, HM,
microprogram control, ALU), time in total. Rows: steps 1 to 7, covering
select hash table, check types, generate hash code, transfer key,
compute hash address, occupy common bus cycle (parallel read),
multiple-jump on E, M, and matched address.]
Abstract

Recently introduced database machine proposals
are critically reviewed. A new architecture for
the cell processor of the RAP database machine utilizing
multiple microprocessors and LSI serial
memories is presented. The proposed cell processor,
designed down to the logic gate level, embodies
concepts of modularity, flexibility, and firmware
driven query processing. The concept of firmware
execution of high level RAP assembler instructions
is presented. The results of various analyses of
the analytical and simulation models of the new
architecture which were carried out elsewhere are
summarized. Special emphasis is given to bulk
memories that have the start-stop controllability
(like magnetic bubble memories or RAM arrays
simulating serial access) together with the
increases in functional capability and performance
obtained by incorporating such memories.

KEYWORDS: DATABASE MACHINES, ASSOCIATIVE PROCESSORS,
DATABASE MANAGEMENT, LSI MEMORIES,
MICROPROCESSORS, COMPUTER ARCHITECTURE


Introduction 

The idea of providing backend computers for
the efficient management of large databases, as a
substitute for the slow software access methods,
has received considerable attention in recent
years. The research efforts spent on this area
have received deserved recognition with the two
special issues of IEEE journals.

In the last years, many specialized processors
for handling the database management operations
have been proposed. Among these there are CASSM [3]
to process hierarchies and tables, RARES for
relational database management and RAP [6] that has
been implemented at the University of Toronto and
has also undergone certain changes. DIRECT [9] is
being implemented at the University of Wisconsin.
Other proposals include the Database Computer
(DBC) as a backend processor-memory complex
and the Bubble Memory Relational System [15].

In this paper we will first survey the most
recent research efforts in the database machine
field and then present a new approach to the RAP
processor architecture, beyond that of RAP.2 [7],
utilizing LSI technology, like off-the-shelf
microprocessors, magnetic bubble memories (MBM),
high density bulk RAM chips, etc.

Survey  of  Recent  DBM  Proposals 

Most of the recent database machine proposals
have exploited the advances in technology by
incorporating microprocessors, CCD's, MBM's and
the like.

DIRECT is a system for supporting relational
databases. The system comprises a host for
interfacing with the users, a backend controller
for coordinating the overall database machine
hardware and software, mass storage units for
storing the database, a set of query processors,
and CCD page frames for holding the relation pages
that are being processed.

In this system, the query processors and CCD
page frames are connected to each other by utilizing
a cross-bar switch, so that all processors
can access all page frames. Although this cross-bar
switch is much simpler than the conventional
cross-bar switches, it may not be cost effective
and may also reduce performance in larger
implementations of this system as proposed
with 103 processors. This is because, as the
number of processors and page frames increases,
the selector/decoder networks at the processor
interfaces and the gating networks at the page
frame interfaces of the cross-bar switch grow in
size, thereby introducing extra delays in the data
transfers between the processors and the page
frames, and hence decreasing performance considerably.

Another feature of the DIRECT system is that the
results of the basic relational algebra operations
executed by the query processors are treated as
temporary relations and are written onto free
page frames allocated by the controller. The
number of temporary relation page frames depends
on the number of query processors assigned to the
query.

This scheme increases the query processor-controller
interaction during page frame processing
because of temporary page frame requests and may
introduce unnecessary page faults for some other
set of query processors executing another query
concurrently, just because their page frames may
be assigned to the temporary relations of a
higher priority query. In this way, the degree of
parallelism may drop seriously because of the
creation of temporary relations. The temporary
relations may cause a more serious performance
degradation during the join operations in which
the system page frame resources have to be
partitioned for the source and result relations.

The join operation may produce result relations
with sizes comparable to the source relation and
it is very likely that this system will suffer the
thrashing problem in the join operation.

The Database Computer (DBC) is a system
proposed for very large databases and a variety
of data models, utilizing modified conventional
moving head disks. The basic system comprises two
processing loops: the structure loop for pipelined
processing of the keywords and record indices, and
the data loop for actually processing the database
contents.

One of the major drawbacks of this system is
its way of representing data as attribute-value
pairs. This scheme of repeating the attribute
information wastes a considerable amount of data
space. Another drawback is that the number of
processors doing the actual processing is very
small compared with the database size, thereby
reducing the parallelism that should be inherent
in database machine systems. Furthermore, the
number of interconnections required between the
disk drive array and the track information
processors may be prohibitive in terms of cost and
physical requirements for the configuration
proposed.

The DBC relies on the concept of partitioned
content addressable memory (PCAM) for data accesses.
A PCAM is one cylinder of a disk volume and is the
largest amount of memory that can be processed
with the limited number of processors. One PCAM
can be processed in one disk revolution, but if the
qualification for a retrieval is complex and/or if
the data to be processed occupies a large number
of cylinders, then many disk revolutions are
necessary for processing the data. The relational
operation of join is also executed in a very
inefficient manner. First, all the qualified
domain values of the source relation are retrieved,
and then for each source value, another retrieval
instruction over the target relation is issued.

This implies that the number of instructions
executed by the track information processors
depends directly on the number of source domain
values.

The performance study of this system in
supporting  relational  databases'1  shows  that  a 
general  purpose  conventional  computer  performs 
better  than  DBC  for  large  relations  (e.g.  with 
20000  tuples)  with  reasonably  large  tuple  sizes. 
This in turn implies that this system, although
designed to support large databases efficiently,
cannot support a database with large relations as
efficiently as a conventional computer, despite
the additional hardware costs introduced.


Furthermore, since this system also relies on
the concept of index processing (although in
hardware), a problem similar to that incurred by
update operations on conventional systems is likely
to occur in DBC, because the structure memory has
to be updated to reflect the result of the update.

Utilization of MBM's for supporting relational
databases has been recently proposed by Chang12.
The proposed hardware comprises MBM chips with
certain augmentations to facilitate associative
selections. A relation is mapped onto one or more
MBM chips with tuples across the minor loops and
the domains along the minor loops. It is claimed
by the author that augmentation of the MBM chips
with off-chip indexing loops provides convenient
indexing during data qualification and avoids
redundant traversing of disqualified data. Two
off-chip registers and a one bit comparator are
provided for the database operations. The
instruction set of this system is said to be
inspired by that of RAP, with minor changes.

The operational deficiencies of this system
result mainly from the following. Since the
hardware employed is substantially small and simple,
provisions for in-place updates have not been
made. Furthermore, the existence of only one
comparator limits parallel comparisons on data, and
hence limits query complexity. Also, the join
operation is handled implicitly as in RAP, but only
a single domain value from a source relation is
transmitted to the target relation per scan. This
mode of operation may severely degrade the
performance of such a system in a join operation.

The following sections describe a restructuring
of the RAP cell processor utilizing off-the-shelf
microprocessors and bulk serial memories,
especially MBM's. The proposed system differs
considerably from the previous designs of RAP.

First, the hardware structure of the cell is
configured into a more regular and modular structure,
and the hardware complexity in terms of chip count
has been reduced to a third of that of the previous
designs. Secondly, query processing driven by
microprocessor firmware and the utilization of
start/stop controllable memories such as MBM's
and/or high density RAM's permit highly complex data
qualifications and a highly efficient join
operation. The proposed system can be considered
as a RAP.3 system
described  In'.  The  reader,  after  following  the 
paper, can draw a comparison of other database
machines with the enhanced features of RAP, as
also summarized in the conclusion, including
especially the join operation.

The RAP database machine can also be regarded
as a good example of a High Level Language Computer
Architecture. Since the present discussion deals
with the architectural aspects of the new version
of the RAP cell structure, and since the basic RAP
architecture and its instruction set are covered
elsewhere5,6,7, we will be content with providing
only a summary description of the latest RAP
instruction set in Appendix-4.

to the maximum tuple size of 1024 bytes.
Furthermore, other data types like floating point
numbers can be easily supported without any extra
hardware. Figure-2 shows the format of the cell CM.


Operation of the Cell


The linear array of subcells provides multiple
buffers (as small RAM's) for the tuples coming from
the CM. At any time during CM circulation, more
than one tuple can be out of the CM; each such tuple
is either being loaded into a subcell buffer, being
stored into the CM from a subcell buffer, or being
processed in a subcell. The existence of multiple
buffers provides the necessary time for processing
the tuples, thereby synchronizing the data move and
data processing rates. The sequence of operations
during a circulation of the CM can be described
with the process/time-slot diagram given in
Figure-3.


In Figure 3, L_i, P_i and S_i denote the load,
process, and store states of some tuple for
subcell i, respectively. When the CM circulation is
initiated, successive tuples are loaded, via DMA,
into successive subcells starting with subcell 1,
until the end of the (k-1)th tuple. In order to
stay in synchronization, the first tuple should be
stored from subcell 1 while the kth tuple is being
loaded into subcell k, and the 2nd tuple should be
stored from subcell 2 while the (k+1)th tuple is
being loaded into subcell 1, etc. During the
circulation, each subcell microprocessor is
initiated for processing as soon as its buffer is
loaded with a new tuple.
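
The load/process/store interleaving described above can be sketched as a slot table. This is only an illustrative model, not the paper's Figure-3: it assumes idealized slots of one tuple load/store time each, and the function name `slot_schedule` and the label format are hypothetical.

```python
def slot_schedule(num_tuples: int, k: int) -> dict[int, list[str]]:
    """Return, for each time slot, the activity of each of the k subcells.

    Assumed model: tuple t (1-based) is loaded in slot t into subcell
    ((t - 1) % k) + 1, processed during the next k - 2 slots, and stored
    in slot t + k - 1.
    """
    if k < 3:
        raise ValueError("need at least 3 subcells: load, process, store")
    last_slot = num_tuples + k - 1
    schedule = {slot: ["-"] * k for slot in range(1, last_slot + 1)}
    for t in range(1, num_tuples + 1):
        subcell = (t - 1) % k                    # 0-based subcell index
        schedule[t][subcell] = f"L{t}"           # load slot
        for slot in range(t + 1, t + k - 1):     # k - 2 processing slots
            schedule[slot][subcell] = f"P{t}"
        schedule[t + k - 1][subcell] = f"S{t}"   # store slot
    return schedule


if __name__ == "__main__":
    for slot, row in slot_schedule(6, 4).items():
        print(f"slot {slot:2d}: " + "  ".join(f"{cell:>3}" for cell in row))
```

In this model, every steady-state slot has one subcell loading, one storing and k-2 processing, which is the (k-2) figure used in the timing discussion.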

It is evident that during the processing of CM
contents, only (k-2) of the k subcells are actually
active at a given time. This may suggest the idea
of multiplexing (k-2) processors among k tuple
buffers or, in general, multiplexing P processors
among M tuple buffers where M > P. If M is not an
integral multiple of P, then a general
interconnection network (e.g. a cross-bar) should be
utilized to allocate processors to buffers. If,
however, M is an integral multiple of P, then a
simple but static interconnection scheme for
multiplexing each processor among (M/P) buffers may
suffice. However, in both cases, besides the
interconnection complexity introduced, the important
feature of CM wait time utilization (to be described
later) cannot be made possible.

After pointing out this alternative to the
original k-parallel microprocessor approach, the
paper will continue dealing with the dedicated
k-parallel microprocessor approach, to elaborate on
the wait schemes and to preserve the modularity of
the cell architecture.

As was pointed out above, the processing time
allocated for a subcell after its tuple is loaded
is

$T_{PR} = (k-2) \cdot T_{LS}$

where k is the number of subcells in a cell and
$T_{LS}$ is the DMA load/store time for a tuple. It
should be noted that the allocated time depends on
the tuple size and is larger for longer tuples.
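
As a worked instance of this formula, the sketch below plugs in the figures quoted later in the paper (k = 4, a 16-bit wide CM shifting at 300 kHz, 1 Kbit tuples), with the bit time defined as in Appendix 1. The function and parameter names are illustrative assumptions, not taken from the paper.

```python
def allocated_processing_time(k: int, shift_rate_hz: float, tuple_bits: int,
                              width_bits: int = 16) -> float:
    """Return T_PR = (k - 2) * T_LS in seconds.

    T_BIT is the CM shift time divided by the memory width, and
    T_LS = T_BIT * tuple_bits is the DMA load/store time of one tuple.
    """
    t_bit = 1.0 / (shift_rate_hz * width_bits)   # CM bit time
    t_ls = t_bit * tuple_bits                    # load/store time per tuple
    return (k - 2) * t_ls


if __name__ == "__main__":
    t_pr = allocated_processing_time(k=4, shift_rate_hz=300e3, tuple_bits=1024)
    print(f"T_PR = {t_pr * 1e6:.1f} microseconds")
```

With these inputs the bit time is about 208 nsec and the allocated time comes out near the 426 µsec quoted in the paper.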

In any case, the worst case expected processing
time should be less than or equal to $T_{PR}$ for a
given tuple size so that synchronization is not
lost. This constraint puts very high demands on
the subcell microprocessor performance and on the
number of subcells k (increasing k increases the
allocated time) if the CM cannot be controlled in
a start/stop fashion (as would be the case with
rotating devices or CCD memories). Furthermore,
this constraint limits the functional capability of
the subcell by restricting the complexity of query
qualification expressions.

The proper use of the start/stop feature of
MBM's (or the asynchronous access feature of bulk
RAM's) relieves the above constraints, so that
hardware parameters can stay within feasible limits.
This is achieved in such a way that no performance
degradation occurs for average processing times,
while longer processing times corresponding to more
complex qualification expressions impose a certain
dynamic performance degradation, which can be traded
off against the goal of minimizing hardware.
Furthermore, it is observed that in the execution
of the relational join operation, handled implicitly
in RAP3, where a target relation (domain) value is
matched disjunctively against an array of source
relation (domain) values, the deliberate imposition
of waits on the CM (by stopping the CM whenever
necessary) reduces the overall time to execute the
join operation. This point will be detailed in a
following section.

Figure-4 shows the process/time-slot distribution
for a controllable CM and for k = 4. The basic idea
behind the utility of the start/stop feature of
controllable memories can be stated in the following
way: when the time comes to store a tuple from a
subcell buffer (e.g. storing subcell 1 while loading
subcell k), if that subcell has not yet asserted
that the processing of the tuple is complete, the CM
is put temporarily in a wait state to allow for the
completion of processing. The extra time requested
by a subcell also becomes available to the (k-2)
succeeding subcells, so that the chance that they
will impose further waits is highly reduced. An
analysis of the timing of operations for this case
is presented in Appendix 1.
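
The wait mechanism can be illustrated with a small timing model. This is a sketch under stated assumptions rather than the paper's simulator: each slot loads one tuple and stores the tuple loaded k-1 slots earlier, and a slot is delayed until the tuple to be stored has finished processing. The name `circulation_time` and its arguments are hypothetical.

```python
def circulation_time(proc_times, k, t_ls):
    """Return (total circulation time, total wait time) for one CM pass.

    proc_times[t] is the processing time needed by tuple t (0-based).
    Slot s loads tuple s (if any remain) and stores tuple s - (k - 1);
    the CM is held in a wait state until that tuple is done.
    """
    nt = len(proc_times)
    done = []              # done[t]: time at which tuple t finishes processing
    slot_start = 0.0
    total_wait = 0.0
    for s in range(nt + k - 1):
        store = s - (k - 1)                     # tuple stored in this slot
        if store >= 0 and done[store] > slot_start:
            total_wait += done[store] - slot_start
            slot_start = done[store]            # CM waits for the subcell
        if s < nt:                              # a tuple is left to load
            done.append(slot_start + t_ls + proc_times[s])
        slot_start += t_ls
    return slot_start, total_wait


if __name__ == "__main__":
    # One slow tuple: its wait also gives the following tuples extra time.
    print(circulation_time([4.0, 3.0, 3.0, 2.0, 2.0, 2.0], k=4, t_ls=1.0))
```

In this model a tuple imposes no wait as long as its processing time does not exceed (k-2) load/store times, and a single wait lengthens the time available to the tuples already being processed.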

Functions  of  the  Basic  Hardware  Modules 


The hardware modules given in Figure-1 have
the following functions:
SUBCELLs: They process the tuples loaded into
their buffers by the DMA CONTROLLER. The
processing is driven by a query routine loaded
into SUBCELL memories prior to the initiation of a
RAP instruction.

DMA CONTROLLER: This module controls the
simultaneous bidirectional data transfers between
the cell memory and subcell buffers during the
load/store operations. It also sequences the
load/process/store operations and keeps track of
the cell CM status.

BUSES:  There  are  four  buses  that  provide  data, 
address  and  control  paths  between  the  cell  modules 
during  data  transfers. 

CELL  INTERFACE:  This  module  coordinates  the 
overall  cell  operation  during  instruction 
initiation and termination, keeps track of cell
status,  and  provides  for  the  communication  of  the 
cell  with  the  RAP  array  controller. 

Query  Execution 

In  the  new  architecture,  the  microprocessors 
of  the  subcells  in  each  cell  are  the  basic  data 
processing  units.  Therefore,  these  microprocessors 
can  be  programmed  to  execute  RAP  instructions''  •” 


The basic idea behind the emulation of RAP
instructions with microprocessor routines is that
each RAP instruction can be mapped into what is
called a "query routine". The basic RAP instruction
constructs (i.e. MARK, RESET, MKED, UNMKED,
updates, set-function computations, comparisons,
etc.) have simple microprocessor code equivalents.
Furthermore, the combination of the results of
various qualification tests as disjunctions or
conjunctions (or mixed, which was not available in
the previous designs) can be embedded into the
sequential logic of the microprocessor query
routine. This mapping brings considerable
enhancements to RAP capabilities, since
qualification complexities are now limited only by
the subcell microprocessor program memory size
instead of the static hardware registers of the
previous designs. Furthermore, since the whole tuple
can be accessed during processing, domain to domain
comparisons and updates are also made possible.
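
To make the idea of a query routine concrete, here is a hypothetical sketch in Python rather than 8086 firmware. The domain names (`dept`, `salary`, `age`, `commission`) and the qualification itself are invented for illustration; the paper's own example is the one in Appendix 3.

```python
def query_routine(tup: dict) -> bool:
    """Emulate a MARK instruction on the tuple held in a subcell buffer.

    Hypothetical qualification:
        ((dept = 'TOY' AND salary > 1000) OR age < 30)
        AND salary > commission
    """
    # Mixed conjunction/disjunction, evaluated by ordinary sequential logic.
    qualifies = (tup["dept"] == "TOY" and tup["salary"] > 1000) or tup["age"] < 30
    # Domain-to-domain comparison: the whole tuple is available in the buffer.
    qualifies = qualifies and tup["salary"] > tup["commission"]
    if qualifies:
        tup["mark"] = True        # set the tuple's mark bit
    return qualifies
```

A longer qualification only lengthens the routine, which is the sense in which complexity is bounded by program memory rather than by fixed comparison registers.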

An  example  of  a  query  routine  is  provided  in 
Appendix  3. 

The subcell microprocessor memory comprises two
parts. The ROM part contains the basic qualification
evaluation routines (i.e. numeric and non-numeric
value comparisons) and routines for the relational
join and free variable operations. The RAM part is
logically partitioned into two parts: one for the
query routines and communication buffers, and the
other for the tuple to be processed.


Before the initiation of a RAP instruction,
the equivalent query routine and/or necessary
parameters are loaded into the RAM's of the
subcells of all the cells involved in the
instruction, after the cell interfaces connect their
cell buses to the buses of the RAP controller.

Each time a CM circulation is started and
whenever a new tuple is loaded into a subcell
buffer, the microprocessor is forced out of the
idle state to branch to the query routine. At the
end of processing, a hardware flag is asserted to
signal the DMA CONTROLLER so that the tuple can be
stored back.

The cell interface is also controlled by a
microprocessor which, after each RAP instruction
is executed on the CM contents, polls each subcell,
updates the cell status and computes (if
applicable) cell set function subresults.

Execution  of  the  Implicit  Join  Operation 

The important and frequently encountered
database operation of join is done implicitly in
RAP* ' J 1  by  the  cross-mark  type  commands.  This 
operation is accomplished by extracting the qualified
source domain values from the source relation
cells and transmitting them to the target relation
cells until all source (master) relation cells are
processed. The execution of this operation had to
be made as efficient as possible, because it was
practically the only case where the superiority of
the RAP system over conventional systems was
estimated to be less than 10-fold13.

The new architecture employs a similar scheme
for this operation. The values from qualified
tuples of the first source relation cell are read
out and buffered at the RAP controller; then a
block of source values is loaded into the subcells
of the target relation cells and these cells are
initiated for processing. This block loading is
repeated until all of the buffered source values
are processed; then the values of the next source
relation cell are buffered and the above operations
are repeated until all source relation cells are
processed.
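
The block-loading scheme just described can be sketched as follows. This is a sequential illustration under simplifying assumptions: cells are plain lists of tuple dictionaries, the join domain is called `join`, and all target cells are scanned in a simple loop, whereas in the hardware they circulate in parallel. All names are hypothetical.

```python
def cross_mark(source_cells, target_cells, block_size, qualifies=lambda t: True):
    """Mark every target tuple whose join value matches a qualified source value.

    Returns the number of target-relation circulations performed.
    """
    circulations = 0
    for cell in source_cells:
        # Read out and buffer the join values of this source cell.
        buffered = [t["join"] for t in cell if qualifies(t)]
        for start in range(0, len(buffered), block_size):
            # Load one block of source values into the target subcells.
            block = set(buffered[start:start + block_size])
            # One circulation: each target value is matched disjunctively
            # against the whole block.
            for target_cell in target_cells:
                for tup in target_cell:
                    if tup["join"] in block:
                        tup["mark"] = True
            circulations += 1
    return circulations
```

The number of circulations per source cell is the number of buffered values divided by the block size, rounded up, which is why a larger block directly reduces the number of target scans.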

The number of source values loaded into
target relation cells per circulation depends on
the size of the RAM space of the subcell. In the
current design, 400 2-byte numeric domain values
(equivalently, 200 4-byte numeric values or a total
of 800 bytes of non-numeric domain values) can be
loaded and matched against a single target value.
This number, compared with the 3 to 5 of previous
RAP designs, shows a significant improvement in the
execution of the join operation. (The improvement,
however, is not as large as the ratio of the loading
factors, due to the differences in the architectures
and the fact that the cross-mark operation is now
broken into discrete steps, each starting at a new
revolution (i.e. a repeated MARK instruction).)

A snapshot of cross-mark execution is provided in
Figure-5.

It is evident that processing that many source
values imposes waits on the CM and hence increases
the overall circulation time. However, it is
observed (in Appendix-2) that if n is the number
of source values that can be processed without
imposing any waits, loading m×n (m > 1) source
values per circulation will reduce the number of
circulations by a factor of 1/m, while the increase
in each circulation time of the target relation
cells will be significantly less than m-fold,
because of the parallelism in the cell. In this way,
the overall time to process a source cell with m×n
values loaded per circulation will be less than the
overall time with n values loaded per circulation.
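
The trade-off can be checked numerically with the approximate ratio derived in Appendix 2, (2 + m(k-2))/k, for the growth of one circulation. The sketch below uses exact fractions; the function names are illustrative and the overall factor is simply that ratio divided by m.

```python
from fractions import Fraction


def circulation_ratio(m: int, k: int) -> Fraction:
    """Approximate growth of one circulation when m*n values are loaded."""
    return Fraction(2 + m * (k - 2), k)


def overall_ratio(m: int, k: int) -> Fraction:
    """Approximate overall join time relative to loading n values per pass.

    The number of circulations shrinks by 1/m while each one grows by
    circulation_ratio(m, k).
    """
    return circulation_ratio(m, k) / m


if __name__ == "__main__":
    for m in (1, 2, 4, 8):
        print(m, circulation_ratio(m, 4), overall_ratio(m, 4))
```

For k = 4 and m = 4, each circulation takes about 5/2 as long, but only a quarter as many are needed, so the overall time is roughly 5/8 of the no-wait case.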

Features  of  the  New  Design 

[Figure-5: Execution of the cross-mark instruction.
Source relation cells: the join domain values are
read out from each source cell and buffered at the
controller. Target relation cells: the entire target
relation is scanned completely in one memory
circulation time.]

The new RAP cell processor based upon the
concepts presented above has been designed down to
the gate level, together with the necessary
microprocessor query routines for the general RAP
instruction constructs18.

In order to arrive at a decision for the
number of subcells to use, various simulation
studies were carried out17,18. Tuple processing
times were sampled from two exponential
distributions. The first distribution modeled
processing times with a minimum of 25 µsec, a mean
of 125 µsec and a maximum of 500 µsec, to reflect
the average case. The second distribution had a
mean of 1000 µsec, with 125 µsec and 2000 µsec as
the bounds, to model heavily loaded processing
sessions as would occur in a join operation. It was
further assumed that the controllable memory array
(16 bits wide) could deliver data at a rate of up to
600 K words/sec. The results of these experiments
are provided in Figure-6.

[Figure-6: Plot of normalized processing time vs. k
(data rate as parameter). (a) Exponential processing
time distribution. (b) Exponential processing time
distribution for CROSS MARK; MIN: 125 µsec, MEAN:
1000 µsec, MAX: 2000 µsec.]

It was decided that k=4 would be a cost-effective
choice to reduce hardware complexity and impose
practically no waits for the average processing
times at the memory rate of 300 K words/sec
(5 M bits/sec), which is attainable by current
MBM's.

The cell design utilizes 4 subcells, where
each subcell contains an Intel 8086 microprocessor
with 2 K bytes of RAM and 1 K bytes of ROM and
some additional control logic. Total chip count
per subcell is 20. The cell memory interface is
configured for 16 x 92 K bit MBM's but can easily
be modified for other types of MBM's and/or bulk
RAM's. (Although it may appear so from the paper,
the reader should not conclude that other types of
bulk serial or block addressable memories cannot be
supported. They can be, except that the further
performance gains achievable through the
controllability feature are not available. The
architecture could also be conceptualized as having
a bulk RAM memory with a single microprocessor,
similar to the original design. However, the speed
demanded of a single microprocessor would be beyond
that conjectured for the future, at least at cost
effective scales. The cost of RAM's would be another
issue: they must be cheap and competitive despite
their volatility.) The total chip count of this
configuration is 160 per cell, which is slightly
over one third of that of the previous designs.

It should be emphasized that the utilization of
8086's is a specific case of the implementation of
the proposed architecture. In fact, besides the
large data bandwidth, only the powerful string
operation instructions and a suitable subset of the
remaining general purpose instructions of the 8086
are utilized for implementing the subcell firmware.
In a possible large scale commercial implementation,
a special purpose microprocessor with only the
necessary instructions can be developed and utilized.
Depending on the cost versus speed trade-offs, it
is also possible to implement the proposed
architecture with powerful 8-bit microprocessors
having fast block operations.

In  memory,  the  CM  data  rates  can  be  as 
high as technology permits. For example, the 8086
based system can support a 16 M bit/sec burst data
rate for low to medium complexity qualification
terms of RAP instructions without any serious
performance degradation due to the utilization of
waits. It may be concluded that it is the
limitations of controllable memories (e.g. MBM's)
that will be the determining factor for the
terminal speed of the proposed architecture.

The simulation studies and analytical modeling
of the cell operation show that considerable
performance improvements over previous RAP designs
can be attained. It has been observed by
simulation18 that the new processor performs 3-6
times better than the previous designs despite the
fact that a larger and slower memory is being
incorporated.

The join operation, which has not been
emphasized (from a performance point of view) in
other database machines, can be performed rather
efficiently, because a larger number of values can
be matched during each circulation.

Furthermore, since all the cell status
information is kept by a microprocessor at the cell
interface, task switching in a preemptive-resume
multiprogramming environment8,15 requires no
extra hardware. Relation status saving and
restoring are accomplished by the two new RAP
instructions SAVE-MARKS and RESTORE-MARKS19,20,
which save and restore tuple mark bits into and
from special domains appended to the end of each
tuple that serve as a push down stack during task
switchings.

The overall RAP system configuration with the
new processor architecture would be similar to
previous RAP configurations6,7, except that the
controller for the cell array, which is currently
being designed, is expected to be a more
intelligent unit. Its main functions will be
keeping track of device status by maintaining the
necessary relation and cell status tables,
instruction scheduling for a RAP query whose
instructions have been converted to microprocessor
code, data buffering in join operations, control
of hardware and software iterative instructions,
computation of overall set function results and
communication with the frontend computer. It is
also expected to do the functions of the monitor15
for the RAP multiprogramming and virtual memory
operations. The entire cell-array controller
configuration will be driven by a conventional
frontend computer to interface the users.

Conclusion 

After a survey of recent database machine
proposals, a new architecture for the RAP database
machine's cell processor is presented. The new
architecture has certain advantages over the
previous hardwired RAP designs. Mainly, the
hardware complexity is decreased while the
operational flexibility is increased. The
utilization of LSI components opens the way for the
modularity of the architecture. The utilization of
controllable memories also relieves the architecture
from the constraints of worst case timing
requirements.

From  a  feature  comparison  point  of  view  the 
proposed  architecture  has  the  following  properties 
one  or  more  of  which  are  not  shared  by  the  other 
database  machines: 

a) Data qualifications of any complexity can
be evaluated over the memory contents in one
circulation of the memory.

b) All kinds of updates and arithmetic
operations can be done on the memory contents
without transferring data in and out of the RAP
system.

c) The join operation is handled in a very
efficient manner. In most of the typical cases,
one target relation cell memory circulation may
suffice to process the values of one source
relation cell, compared to the large number of
circulations (or revolutions) required in the
other database machine proposals.

d) Since no software access methods are
utilized, no overhead on the frontend computer is
imposed.

e)  A  multiprogramming  environment  can  be 
attained  without  any  extra  hardware. 

f) It is expected that a single RAP database
machine is going to be confined within certain
practical physical limits. In order to support
very large database applications, either one or
a combination of the following two system
configurations can be incorporated:

c  \A 

1)  Virtual  memory  back  up  as  In  ’  for  a 
single  processor 

2) The database can be distributed over a
network of RAP database machines, and a given
database operation can be decomposed and executed
concurrently on the network of modest size RAP's,
as shown by a previous study22.

The RAP.3 prototype implementation, along with its
already operational software, is nearing completion
at METU.


Appendix 1

Timing Analysis of Cell Operations

The following analysis describes the
relationships among certain timing parameters.

Let

$T_{BIT}$ = CM bit time = CM shift time/16

$TUPLEN$ = length of a tuple in bits

$k$ = number of subcells/cell ($k \ge 3$ because
of the data move strategy incorporated)

$T_{LS}$ = time to load (store) a tuple via
DMA = $T_{BIT} \cdot TUPLEN$

$T_{W_i}$ = available time to process tuple i

$NT$ = number of tuples in CM

$T_{TOTAL}$ = total circulation time of CM from
the start of loading of the first tuple to the end
of storing of the last tuple

$T_{REST}$ = extra time needed to restore the last
(k-1) tuples. (It should be noted that CM
circulation is completed only after the last tuple
is restored. Some extra time is needed to restore
the last (k-1) tuples because the total dynamic
capacity of the cell memory is equal to the CM
capacity plus the capacities of the (k-1) subcell
buffers.)

$T_{WAIT}$ = time during which the cell memory is
in the wait state in a circulation

Then we have the following relationships:

$T_{W_i} = (k-2) \cdot T_{LS} + \sum_{j=i-(k-2)}^{i-1} w_j$
($i = 1, \ldots, NT$; $w_j = 0$ for $j \le 0$)

where $w_j$ are the wait times associated with
tuple j (ref. Figure 4).

$T_{WAIT} = \sum_{j=1}^{NT} w_j$

$T_{REST} = (k-1) \cdot T_{LS}$

$T_{TOTAL} = NT \cdot T_{LS} + T_{WAIT} + T_{REST} = (NT + k - 1) \cdot T_{LS} + T_{WAIT}$

It should be noted that $T_{WAIT}$ is dependent
on the complexity of the query routine (if k and
$T_{BIT}$ are fixed), but an upper bound on
$T_{TOTAL}$ can be derived as follows:

Assume that all tuples require exactly L times
the time allowed by the architecture:

$T_{REQ} = L \cdot (k-2) \cdot T_{LS}$, ($L \ge 1$)

Assuming also that mod(NT, k) = 0, then
during the circulation, NT/k tuples will be
processed by each subcell. The time to handle a
tuple is:

$T_{TUPLE} = T_{REQ} + 2 \cdot T_{LS}$

where the last term accounts for the load and store
times.

Since the processing of the tuples is overlapped
over the k subcells, the total time for a
circulation will be:

$T_{TOTAL} = (NT/k) \cdot T_{TUPLE} + (k-1) \cdot T_{LS} + (L-1) \cdot (k-2) \cdot T_{LS}$

where the first term is the time to process NT
tuples with k subcells in parallel, the second
term is the time to restore the (k-1) tuples at
the end of the circulation and the third term is
the initial extra time (beyond the allocated time)
required by subcell 1 for tuple 1. Inserting
$T_{TUPLE}$ gives an upper bound for the circulation
time when each tuple requires L times the
allocated time, as:

$T_{TOTAL} = ((NT/k) \cdot (2 + L \cdot (k-2)) + L \cdot (k-2) + 1) \cdot T_{LS}$
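
As a quick numerical check of the algebra, the sketch below evaluates the three-term expression and the closed form and confirms that they agree. The function name and the sample values are illustrative assumptions; times are expressed in units of one tuple load/store time.

```python
def total_time_upper_bound(nt: int, k: int, big_l: float, t_ls: float) -> float:
    """Closed-form circulation time when every tuple needs L times the
    allocated processing time (assumes nt is a multiple of k)."""
    t_req = big_l * (k - 2) * t_ls            # required processing time
    t_tuple = t_req + 2 * t_ls                # plus load and store
    three_terms = ((nt / k) * t_tuple         # overlapped processing
                   + (k - 1) * t_ls           # restoring the last k-1 tuples
                   + (big_l - 1) * (k - 2) * t_ls)   # initial extra time
    closed_form = ((nt / k) * (2 + big_l * (k - 2))
                   + big_l * (k - 2) + 1) * t_ls
    # Both forms are the same expression, up to rounding.
    assert abs(three_terms - closed_form) <= 1e-9 * max(1.0, closed_form)
    return closed_form


if __name__ == "__main__":
    print(total_time_upper_bound(nt=8, k=4, big_l=1, t_ls=1.0))   # no waits
    print(total_time_upper_bound(nt=8, k=4, big_l=2, t_ls=1.0))
```

With L = 1 the expression reduces to (NT + k - 1) load/store times, which is the no-wait circulation time used in Appendix 2.
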
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Anuemlix  ? 

Analysis  of  the  Join  One  rat  ion  rei^ormnn;  *■ 

The  time  to  process  one  source  ivli'n-n  « ** I ; 
contents  can  be  approximated  as: 

T  T  ,  |NS|  V(T 

'JOIN  “  tBUF  +  1  n  1  'TOTAL 

where the first term is the time to read and buffer
the source cell values (1 CM circulation) and the
second term is the time to process the NS buffered
source values and represents $\lceil NS/n \rceil$ circulations
(i.e., n values are passed in each circulation)
of the target relation cells.

If $n_1$ is the number of source values that can
be processed in one target relation cell memory
circulation without imposing any waits (L = 1), then
the total circulation time for this case will be
(ref. Appendix 1):

$$T_{TOTAL,nowait} = (NT + k - 1)\,T_{LS}$$

while the total number of such circulations will be
$\lceil NS/n_1 \rceil$.


If $m \cdot n_1$ source values are processed in one
target relation cell memory circulation, then L
will be roughly m, and the total circulation time
will be:

$$T_{TOTAL,wait} = \left((NT/k)\,(2 + m(k-2)) + m(k-2) + 1\right) T_{LS}$$

while the total number of such circulations will
be $\lceil NS/(m \cdot n_1) \rceil$.

Neglecting the terms k-1 and m(k-2)+1 in the
two total-time equations, which are much less than the
corresponding terms, we can write:

$$\frac{T_{TOTAL,wait}}{T_{TOTAL,nowait}} \approx \frac{(NT/k)\,(2 + m(k-2))}{NT} = \frac{2 + m(k-2)}{k} < m \quad \text{for } m > 1$$

It can be observed that imposing waits on the
CM by feeding in more source values per circulation
reduces the number of circulations by a factor of
1/m, while the increase in the total time of each
such circulation is less than m-fold; hence the
overall time to process the buffered source values
is reduced.

The actual execution time will in reality be
much less than the bounds derived above because,
after each target relation scan, the number of
target values that are not yet selected, and
hence still impose waits, diminishes at an
increasing rate until the last target relation
scan.
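The trade-off between waits and the number of circulations can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the buffering term T_BUF is left out, and times are in units of T_LS. Setting m = 1 reproduces the no-wait case.

```python
import math


def total_time_wait(nt: int, k: int, m: int, t_ls: float) -> float:
    """Circulation time when m*n1 source values are passed (L ~ m)."""
    return ((nt / k) * (2 + m * (k - 2)) + m * (k - 2) + 1) * t_ls


def source_processing_time(ns: int, n1: int, nt: int, k: int,
                           m: int, t_ls: float) -> float:
    """Time to process NS buffered source values, m*n1 per circulation.

    This is the second term of T_JOIN; T_BUF is deliberately omitted.
    """
    circulations = math.ceil(ns / (m * n1))
    return circulations * total_time_wait(nt, k, m, t_ls)
```

With, say, NS = 400, n1 = 100, NT = 1000 and k = 4 (values chosen only for the example), m = 4 needs one long circulation instead of four short ones, and the total comes out smaller, as the ratio (2 + m(k-2))/k < m predicts.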


QUERYRTN  LEA   BP,MASKD          / check if tuple is deleted
          CALL  MKED              / previously;
          JB    NOTQUAL           / exit if deleted;
          LEA   BP,MASKT4         / check if tuple is T4 marked
          CALL  MKED              / previously;
          JNB   NOTQUAL           / exit if not T4 marked;
          LEA   BP,PB1            / set pointer to parameter block 1;
          CALL  COMPNUM2          / call numeric comparison routine;
          JNB   NOTQUAL           / exit if comparison fails;
          LEA   BP,PB2            / set pointer to parameter block 2;
          CALL  COMPLITR          / call literal comparison routine;
          JNB   NOTQUAL           / exit if comparison fails;
          LEA   BP,PB3            / set pointer to parameter block 3;
          CALL  ADD2              / tuple is qualified, update it
NOTQUAL   JMP   WAIT              / and wait until next tuple;
MASKD     DC    X'8000'           / mask for deleted tuples;
MASKT4    DC    X'0800'           / T4 marked mask;
PB1       DC    A(TUPLE+SALARY)   / address of SALARY domain in buffer;
          DC    H'2000'           / external comparand;
          DC    H'4'              / comparison mode for "greater than";
PB2       DC    A(TUPLE+DEPT)     / address of DEPT domain in buffer;
          DC    H'8'              / length of the domain;
          DC    H'2'              / comparison mode for "equal to";
          DC    C'SHOE'           / external comparand;
PB3       DC    A(TUPLE+SALARY)   /
          DC    H'500'            / external value to be added.

Figure 7. Intel 8086 Program for a RAP Instruction

The current design employs four subcells
(k = 4) and assumes that CM shifts at 300 kHz giving
a TBIT of 208 nsec/bit; then for 1 Kbit target
relation tuples, the allocated time is 426 µsecs.
Within this time, the INTEL 8086 routine
developed to perform equi-join on 2 byte numeric
domains can process 100 source values without
imposing any waits. Processing 400 such source
values gives L = 4. Since a target relation value
may qualify for the join before the whole source
value block is scanned, the actual average total
circulation time will be considerably less than
what was found in the above analysis for the worst
case assumptions.
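The figures quoted for the current design can be reproduced with a few lines of arithmetic. The sketch below rests on two assumptions that the text does not state explicitly: that the load/store time T_LS of a tuple equals its length in bits times TBIT, and that "1 Kbit" means 1024 bits. The function names are illustrative only.

```python
def load_store_time_us(tuple_bits: int, t_bit_ns: float) -> float:
    """T_LS in microseconds, assuming T_LS = tuple length * TBIT."""
    return tuple_bits * t_bit_ns / 1000.0


def allocated_time_us(k: int, tuple_bits: int, t_bit_ns: float) -> float:
    """Processing time allowed per tuple, (k-2) * T_LS (the L = 1 case)."""
    return (k - 2) * load_store_time_us(tuple_bits, t_bit_ns)


def wait_factor(source_values: int, nowait_values: int) -> float:
    """L for a block of source values, relative to the no-wait block size."""
    return source_values / nowait_values
```

Under these assumptions, four subcells, 1024-bit tuples and 208 nsec/bit give an allocated time of about 426 µsecs, and 400 source values against a no-wait capacity of 100 give L = 4, in agreement with the text.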

Appendix  3 

Query Routine Example


SUM      : Selects and accumulates
COUNT    : Selects and counts
MAX      : Selects and finds the maximum
MIN      : Selects and finds the minimum
AVERAGE  : Selects and computes average

Insertion and deletion commands: Insert and delete
record occurrences.

DELETE   : Selects and deletes record
           occurrences from the record type
INSERT   : Inserts record occurrences into
           the record type

Consider the RAP instruction:

ADD [EMP (SALARY): MKED (T4) &
     SALARY > 2000 & DEPT = 'SHOE'] [500]

which adds 500 to the salaries of those employees
who satisfy the accompanying qualification
expression. It is assumed that SALARY is a 2 byte
numeric domain and DEPT is an 8 byte literal
domain.


The query routine for this RAP instruction,
in the INTEL 8086 instruction set, can be given as in
Figure 7.

The routines MKED, COMPNUM2, COMPLITR and
ADD2 reside in subcell ROM and perform mark status
tests, value comparisons and addition updates on
the domains of the tuples according to the
information supplied with the associated parameter
blocks. Since the data qualification evaluation
and the update are done together, this instruction
would take only one cell memory circulation to
process all the tuples of a relation.
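The control flow of such a query routine can be paraphrased in a high-level language. The following sketch is not the subcell ROM code; it only mirrors the sequence of tests and the update for the ADD instruction considered here. The class `EmpTuple`, its fields and the function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EmpTuple:
    """Hypothetical in-buffer image of one EMP tuple."""
    salary: int
    dept: str
    deleted: bool = False
    marks: set = field(default_factory=set)


def query_routine(t: EmpTuple, increment: int = 500) -> bool:
    """Qualify one tuple and update it in the same pass."""
    if t.deleted:                # deleted tuples never qualify
        return False
    if "T4" not in t.marks:      # MKED (T4)
        return False
    if not t.salary > 2000:      # SALARY > 2000
        return False
    if t.dept != "SHOE":         # DEPT = 'SHOE'
        return False
    t.salary += increment        # tuple is qualified, update it
    return True


def circulate(relation: list) -> int:
    """One circulation: every tuple is tested and updated exactly once."""
    return sum(query_routine(t) for t in relation)
```

Because qualification and update happen in the same call, a single pass over the relation suffices, which is the point made above about one cell memory circulation.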

It should be noted that it is possible to
construct query routines for data qualifications
and/or updates of any complexity.

Appendix  4 

Summary of the instruction set of the RAP
DBMS "Assembler" language

Selection  and  retrieval  commands:  Implement 
selection  and/or  data  retrieval. 


MARK             : Selects and tags
RESET            : Selects and removes tags
READ             : Selects and reads
CROSS_MARK       : Maps between two record types
CROSS_COND_MARK  : Maps between two record types
GET_FIRST_MARK   : Cursor and mapping within a
                   record type
GET_FIRST        : Cursor
SAVE             : Selects and saves item in RAP
                   register


Data  definition  commands:  Initialize,  populate, 
and  delete  a  record  type. 

RELATION : Defines a new relation (record
           type). Size, type, length
           parameters for the data are
           declared. (Key attributes and
           access paths are defined if the
           software emulator rather than the
           actual machine is used). User
           capabilities, access rights, and
           the protection parameters are also
           declared with the use of this
           command.
CREATE   : Populates the database for the
           specific record types which have been
           defined by the RELATION command.
DESTROY  : Deletes a record type

System  commands: 


AUTHORIZE     : Grants access to the user via a
                password
LOCK          : Specified record types are locked
                against concurrent accesses
RELEASE       : Releases locks
SAVE_MARKS    : Current mark bits of specified
                relations are pushed onto stacks of
                each tuple
RESTORE_MARKS : Restores marks by popping the
                saved mark bits
LOCATE        : Returns the node address of the
                relation being searched
MOVE          : Moves an entire or restricted
                subset of a relation to the
                specified site
STATUS        : Performs dynamic status checking
                for branching purposes
READ_MARKS    : Same as READ, but output also
                includes mark bits


Update commands: Perform selection and in-place
arithmetic and replacement updates.

ADD      : Item1 ← Item1 + Item2 (or constant)
SUB      : Item1 ← Item1 - Item2 (or constant)
MUL      : Item1 ← Item1 * Item2 (or constant)
DIV      : Item1 ← Item1 / Item2 (or constant)
REPLACE  : Item1 ← Item2

Statistical (Set function) commands: Select and
compute functions in-place.

Register manipulation commands:

READ_REG   : Reads out RAP registers
STORE_REG  : Enters data into user registers
DEC_REG    : Decrements specified register
             contents by one
INC_REG    : Increments specified register
             contents by one
RADD, RSUB, RMUL, RDIV : Perform specified arithmetic
             operations on registers as:
             <reg> ← <reg> <ropr> <operand> where
             ropr is one of RADD, RSUB, RMUL, or
             RDIV.
Decision and transfer commands: Control program
loops.

TEST  : Tests presence of tags within a
        record type
BC    : Branch, conditional and unconditional
EOQ   : End-of-query
ABSTRACT

A direct-execution model, based on the tree-structured
internal representation of the source-texts, has been
defined. It features a single intermediate environment
and two environment transfers: the first one corresponds
to a bidirectional translation between the source-text
and the tree-structured internal form. The second one is
a conventional microprogrammed interpretative process on
a specialised hardware architecture.

In this paper, a full description of a hardware
architecture which directly holds the tree-structured
forms is given. Its characteristic features are
discussed and the micro-control operations which
deal with the main tree-structured form concepts
(recursivity, top-down tree traversing, escapes)
are presented.


1  -  INTRODUCTION 

To solve the problems resulting from the semantic
gap, which arise in conventional computer systems,
new computer architectures have been proposed in the
last few years. Their purpose is to support directly
one or more high level languages in hardware. In this
way, eliminating the order-codes tends to close the
gap between the high level language and the physical
structure of the host machine.

Although the Von Neumann architecture is increasingly
and rightly questioned, none of the proposed high level
language processor systems has been marketed
successfully. We tried to analyse the reasons for
these failures, and it appears that the attractiveness
of the Von Neumann architecture resides in its
conceptual simplicity, whereas the suggested
solutions [3,4,5] are characterized by complex models,
difficult to understand and to implement, and often
leading to needlessly intricate architectures.

Therefore, we have proposed a direct execution
scheme, based upon the definition of a class of
list-structured Directly Executable Languages (DELs),
which is derived from LISP. The objective of this
scheme is to provide the implementation of high level
languages with a systematic support that is easy to
understand and to use.


1.1. The 3L-model

A direct execution scheme with a single level
was defined, i.e. a scheme including only one
intermediate environment between the source-text and
the execution environment (fig. 1).


Fig. 1 - The 3L-model


. A first interactive processor, the editor, is
responsible for the communication between the external
environment (source-text) and the internal
environment (DEL).

. A second processor, the interpreter, is
responsible for the evaluation of the internal form
through the hardware operators.

. The 3L-machine (M3L) is the physical support of the
3L-model. Both processors are microprogrammed on
M3L, with a high level microprogramming language
specialized in the expression of emulation
processing: the Language for Emulation (LEM).

1.2. The 3L-form

The choice of the intermediate environment
determines the direct execution scheme. As we wished to
maintain the whole semantics of the source-text
while providing the interpreter with a form that is
easy to handle, we chose a list-structured internal form,
based upon LISP: the LISP-Like-Languages (3L). The
3L form is prefixed and fully parenthesized. Although
its semantic power is very high, its syntax is
absolutely trivial and it offers a highly systematic
internal representation of the programs.

The 3L form is represented within the memory
by a binary tree-structured form. This form is
tagged; its unit is the pair-cell:

     16 bits     16 bits     8 bits
   |   CAR    |   CDR    |   DES   |

The CAR field generally represents a left pointer,
the CDR field a right pointer, and the DES field
gives the description of the cell content, more
precisely for the representation of objects.

Example : Suppose that in the high level language
we have the operation f(x,g(y)). It can be expressed
in the terms of the symbolic 3L form as
(f x (g y)), which is stored in the pair-cell memory
as a binary tree of such cells.
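How a symbolic form such as (f x (g y)) maps onto pair-cells can be sketched as follows. This is an illustration only: the two DES values "ATOM" and "LIST", the use of list indices as pointers, the NIL value and the function names are all assumptions, not the M3L encoding.

```python
NIL = -1  # assumed end-of-list pointer


def build(expr: list, memory: list) -> int:
    """Store a nested list as a chain of pair-cells; return the head index."""
    head = NIL
    for item in reversed(expr):
        if isinstance(item, list):
            # CAR points to the sub-list, CDR to the rest of this list
            cell = {"car": build(item, memory), "cdr": head, "des": "LIST"}
        else:
            # CAR holds the atom itself in this simplified model
            cell = {"car": item, "cdr": head, "des": "ATOM"}
        memory.append(cell)
        head = len(memory) - 1
    return head


def to_text(ptr: int, memory: list) -> str:
    """Rebuild the prefixed, fully parenthesized text from the cells."""
    parts = []
    while ptr != NIL:
        cell = memory[ptr]
        if cell["des"] == "LIST":
            parts.append(to_text(cell["car"], memory))
        else:
            parts.append(cell["car"])
        ptr = cell["cdr"]
    return "(" + " ".join(parts) + ")"
```

Five cells are needed for f(x,g(y)): three for the outer list and two for the inner one. The DES tag is what lets the traversal decide, cell by cell, whether to descend or to read an atom.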


[Figure: block diagram of M3L. The stack memory, the
executive memory, the microprograms memory and the
pair-cells memory are interconnected by the general bus.]

Fig. 2 - ARCHITECTURE OF M3L


2  -  THE  GENERAL  STRUCTURE  OF  M3L 

The M3L project started with a systematic study
of the interpretation of LISP. First, we defined a
pseudo-machine, then we wrote a simulator, and
developed a microprogrammed LISP interpreter upon it.
The simulation measurements led to a new architecture,
which was defined for the M3L prototype,
presently in the completion phase.

2.1. Synoptic of M3L

The general organisation of the 3L-machine is
very simple. The resources are interconnected via a
single bus which determines the datapath. The datapath
is 16 bits wide, which corresponds to the maximal size
of the prototype pair-cells memory (fig. 2).

In the 3L-machine there are four categories of
registers :

. Ai registers, i ∈ [0,15]

they are used for current work and information
transfers between microprocedures

. Bi registers, i ∈ [0,255]

they serve as global registers for every
microprocedure; they contain the descriptors of the
currently emulated system

. Ti registers, i ∈ [0,31]

they are flip-flops which give the status of the
system. They are global resources and some of
them can be set or reset by the programmer

. Ri registers, i ∈ [0,3]

they make recursivity in LEM possible through
their locality.

2.2. The numerical processing

In the Von Neumann architecture, numerical
processing is prevalent. It is represented by the
central operator and the inputs/outputs. It is more and
more integrated, especially in microsystems.
On the contrary, in a high level language
processor non-numerical processing is prevalent.
This is true for M3L, where the architecture is
designed according to the emulation processing. Of
course it is still necessary to incorporate the elements
of numerical processing within this architecture.
Nevertheless, they take a marginal place in M3L and
they are entirely supported by a single LSI family
(AMD 2900).

. The arithmetical processor

Most of the arithmetical functions of the 3L
machine are performed by a monolithic processor
(AM 9511). This processor relieves the machine of
all the corresponding micro-software, which is of
minor importance for emulation. It can be viewed as a
peripheral of M3L: it runs in parallel, and it is
interfaced through the general bus. The main operations
performed by the AM 9511 are :

-  18 data manipulation operations : fixed-float
   conversions, read, write, ...

-  5 fixed arithmetical operations (16 and 32 bits)

-  4 float arithmetical operations (32 bits) :
   +, -, *, /

-  11 secondary operations (32 bits float) :
   sin, cos, x^y, ...

. The arithmetical and logical unit (ALU)

The ALU of M3L is built from four AM 2903 LSI
chips. Owing to the use of an arithmetical processor,
its task is very small : it has to manage the
Ai registers, and it performs data comparisons which
are typical tasks of the environment transfers.

. Inputs/outputs

The inputs/outputs system is built from an 8
bit wide peripheral minibus on which the interface
adapters for asynchronous communications are
connected. These chips perform the standard control
functions according to the CCITT V24 Standard. The
minimal version of M3L includes an ACIA for driving
the TTY, and another for interacting with a
microsystem, responsible for the management of
inputs/outputs and disk-files.
Fig. 3 - THE PERIPHERAL MINIBUS


3 - THE MICROCONTROL


Microprograms, written in LEM, are compiled to
produce fixed microcode. The vertical microprogramming
used for this implementation results in two
advantages : the effort of the compiler is smaller
and the size of microinstructions can be shortened.
This reduces the amount of microcode to swap during
control switches.

The great diversity of control signals to
provide (in particular, to control the tri-state bus)
has led to a two-leveled microprogramming. The method
used here is different from the nanoprogramming of
the QM-1, which uses a second level of microprogramming.
To execute a microinstruction through the datapath
one must :

1. provide some parameters :

   - number of Ai, Bi, Ri ...

   - long, short constant

   - code number of branch operation, of
     ALU function ...

2. define an action to execute, i.e. state a
   particular data transfer through the datapath.

The second part, fixed for a given action, still
requires many more bits for the direct control of
gates. Repeating such a long "dead-bit" sequence
is cumbersome; thus, the action to be executed
is specified by the second level of
microprogramming, in a single horizontal word where each
control bit directly drives the gates : it is the
executive.

The format of a fixed-size microinstruction is
then an operation code OPC followed by parameters Pi :
OPC represents the code number of an executive, and
the Pi's are the arguments.

The size of the microinstructions is 32 bits.
The operation code (OPC) can address up to
256 executives. Theoretically, a great number of
executives can be defined, but in practice the
facilities of a datapath are never completely put to
use : our simulation of a LISP system required only
60 executives. The executives reside in a fast
PROM memory (35 ns) with 256 words of 116-bit
length.
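The split between the two levels can be made concrete with a small packing sketch. It assumes that the 32-bit microinstruction consists of an 8-bit operation code (256 executives) in the most significant byte followed by the 24-bit parameter field; the position of the operation code within the word is an assumption, and the function names are illustrative.

```python
OPC_BITS = 8      # up to 256 executives
PARAM_BITS = 24   # parameter field


def encode(opc: int, params: int) -> int:
    """Pack an executive number and a parameter field into one 32-bit word."""
    if not 0 <= opc < (1 << OPC_BITS):
        raise ValueError("operation code must fit in 8 bits")
    if not 0 <= params < (1 << PARAM_BITS):
        raise ValueError("parameter field must fit in 24 bits")
    return (opc << PARAM_BITS) | params


def decode(word: int) -> tuple:
    """Split a 32-bit word into (executive number, parameter field)."""
    return word >> PARAM_BITS, word & ((1 << PARAM_BITS) - 1)
```

The executive number would then index the 256-word PROM holding the horizontal control words, while the parameter field supplies the register numbers, constants and branch codes.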

3.2. Description of the microcontrol words

. The microinstruction parameters

There are 10 available Pi parameters. A
microinstruction is an assembly of some of these
parameters. The assembly rules are stated by the place of
each parameter within the 24-bit parameter field.

The typical formats, within bits 23 to 0 of the
parameter field, are :

- Ai registers : three different places are available

- Ri registers : four different places are available

- Bi registers

- Ti registers : they are associated with the
  BRanch field

- ESC and IND : ESC is the escape tag and IND specifies
  the stop mode for the return on escape condition :
  <, =, >

. The executive word

The executive is divided into 14 sub-fields, each of
which may or may not be attached to a particular control
task upon the datapath. The main sub-fields are
listed below.

field name     CONTROL OF

µPC            microprogram counter
M,S            MPX (Shift and Mask)
STK            Stack memory
MSEL           Memories selection
ALU            AMD 2903 ALU
SA,B,C,D       source selection for the general
               bus transfers
RA,B,C,D       receptor selection for the general
               bus transfers

(A, B, C, D specify the four different transfer modes,
via the general bus)

3.3. Operating

The cycle time of the M3L microinstructions is
fixed at 500 ns. It may seem long for modern
technology, but with regard to the power of the
microinstructions it is a good speed. The cycle starts with
the fetch of the microinstruction (100 ns); it
includes some register moves, and always a main
control phase which is 200 ns long.

As the case may be, this phase performs :

- an access to the pair-cells memory

- an arithmetical operation on the ALU

- a context switch with an access to the stack memory

- a refresh cycle.


A suspension is a request for a temporary halt
of the current microprogram. During this halt a
single microinstruction is performed. The suspension
takes place when the latch is loaded : an encoder
detects the suspension and yields its number. As
there are 8 different suspensions, the first 8
executives are therefore regarded as suspension
handlers.


One of these suspensions is the refresh
request for the dynamic MOS memory. The aim of this
suspension is to perform a refresh cycle without
modifying the current context.


3.5. Interrupts and microinstruction tracing


Another suspension is associated with the
interrupt request. It has to save the current
context without changing the microprogram counter
(µPC), and it has to branch to the interrupt
handler. At the hardware level, the management of
interrupts is achieved with the help of two interrupt
controllers (AM 2914) which allow the handling of
16 interrupt levels.


To each microinstruction word, a tracing byte
is concatenated, where each bit is associated with a
microsoftware interrupt. The bits are set at the
compiling stage. Thus, when running, they activate
the corresponding trace interrupts, which are then
handled sequentially, according to their priority level.

They can be enabled or disabled in software. They
are used in microprogram debugging and for the M3L
prototype measurements.

4  -  THE  PAIR-CELLS  MEMORY 

The pair-cells memory is the main resource of
M3L. It is built with dynamic MOS memory. Each chip
contains 16 K x 1 bits and its access time is 150 ns.
The pair-cells memory is organized in 40-bit wide
words which are divided into three fields of
16, 16 and 8 bits respectively.
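The 40-bit word layout can be illustrated by packing and unpacking the three fields. The sketch assumes that the two 16-bit pointer fields occupy the most significant bits and the 8-bit descriptor the least significant byte; the actual bit ordering in the prototype is not given here, and the function names are illustrative.

```python
def pack_cell(car: int, cdr: int, des: int) -> int:
    """Pack two 16-bit pointers and an 8-bit descriptor into a 40-bit word."""
    if not (0 <= car < 1 << 16 and 0 <= cdr < 1 << 16 and 0 <= des < 1 << 8):
        raise ValueError("field out of range")
    return (car << 24) | (cdr << 8) | des


def unpack_cell(word: int) -> tuple:
    """Return (car, cdr, des) from a 40-bit pair-cell word."""
    return (word >> 24) & 0xFFFF, (word >> 8) & 0xFFFF, word & 0xFF
```

A 16-bit pointer addresses 64 K cells, which matches the size of the prototype pair-cells memory.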


Fig. 4 - THE TWO-LEVELED MICROPROGRAMMING


[Figure: THE PAIR-CELLS MEMORY, with its WRITE and
READ paths]


The access to the pair-cells memory, in the
read/write mode, is done through the general bus.
With respect to data moving there are two kinds of
access in the read mode and one in the write mode.
As for control there are three kinds of access.

4.1. Access to the pointer field

. Data moving in the read mode

Single transfer : in the LEM syntax, the first register
specifies the receiver and the second one contains the
address of the emitter.

Example : A2 ← F1(R3) means :

"read the F1 field of the pair-cell whose address
is stated in the R3 register, and store it into
the A2 register"

When compiled it yields a PC-READ1 microinstruction
whose parameter field carries the Ri and Ai register
numbers together with the Ti, T and BR fields.

Double transfer : the first register contains the
address of the emitter whereas the second and the third
registers deal with the receivers of the fields F0 and
F1.

Example : "fetch A1 into A2 and R3" reads the fields F0
and F1 of the pair-cell addressed by A1 into A2 and R3.

It yields a PC-READ2 microinstruction whose parameter
field carries the corresponding Ai and Ri register
numbers together with the Ti, T and BR fields.

. Data moving in the write mode

The first register contains the address of the
memory cell to be modified, and the second one
contains the information to be moved.

. Access in the control mode

The register contains the address of the
referenced memory word. After reading, the content
of the corresponding Fi field is added to the
microinstruction address register. In most situations,
the access in the control mode concerns the
descriptor field of the memory cell. Hence, this
multiple branch operation enables the 3L form to be
decoded. More details on this microinstruction are
given in the section 6.3.

4.2. Access to the descriptor field

Whereas the access to the pointer fields (F0,
F1) is fixed, the access to the descriptor field
(F2) is more versatile. As a matter of fact, for a
given emulated system, this field can arbitrarily
be divided into contiguous or overlapping sub-fields.
These sub-fields can be accessed in the read/write
or control mode, like the pointer fields.

Ensuring the access to a sub-field of DES needs
a special device to select the field. This device
has been discussed elsewhere in a more general
situation. Here it is applied to a byte only, thus it is
very simple. There is a mechanism for the reading
operation, and another mechanism, strictly symmetric,
for the writing operation. Therefore we shall only
describe the fetch mechanism.

[Figure: the descriptor memory connected to the GENERAL BUS
through the shift and mask levels]

Fig. 6 - PRINCIPLE OF THE DES ACCESSING MECHANISM


A first logical level, SHT, performs a circular
shift on the descriptor byte. This shift is performed
in a purely combinatorial and parallel manner by a
special chip (SGN 0243). A second logical level masks
the irrelevant part of the descriptor byte. The
selection of a field requires the specification of a
shift (0-7) and a mask (a byte). This information
is included in the executive of the microinstruction
which fetches the sub-field.


P1 takes its input arguments from the Ai registers
and outputs its results to P2 via the Ai's. The
purpose of the Ri registers is to preserve the values
of the Ri registers in the environment of P1; these
values must not be erased by the application of P2.

Example : DESCRIPTOR SUB-FIELDS

MASK     SHIFT

# FF     0
# 07     0
# 0F     4
# 01     3

The combinatorial nature of the select mechanism
of the descriptor sub-fields enables the M3L "memory
word" to be viewed as a sequence of fields
Fi, i = 0..n, which are equally accessible in the read,
write, or control mode, in a single microinstruction
cycle. This emphasises the thorough attention
which was paid to the access to the intermediate
environment on M3L.
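The two-level selection of a descriptor sub-field (circular shift, then mask) can be sketched in a few lines. The direction of the circular shift is an assumption here (a right rotation, so that the selected sub-field ends up right-justified), and the function names are illustrative.

```python
def rotate_right8(byte: int, shift: int) -> int:
    """Circular right shift of an 8-bit value by 0..7 positions."""
    shift %= 8
    return ((byte >> shift) | (byte << (8 - shift))) & 0xFF


def read_subfield(des: int, shift: int, mask: int) -> int:
    """First level: circular shift; second level: mask the irrelevant bits."""
    return rotate_right8(des, shift) & mask
```

With this convention a mask of #0F and a shift of 4 select the upper half of the descriptor byte, a mask of #01 and a shift of 3 select bit 3 alone, and a mask of #FF with a shift of 0 returns the whole byte.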


5 - THE CONTROL UNIT

Beyond the special organization of the main
memory, the second feature of the M3L architecture
concerns its control unit. As a matter of fact, it
has to support the recursivity mechanism which is a
fundamental aspect of the emulation functions. The
LEM language is recursive and this is conveyed
through the hardware structure at the level of the
control unit of M3L.

A LEM module is composed of small procedures
which are independent and unordered. They can
refer to each other and even to themselves. In
control switching from one microprocedure to another,
the Ai global registers are used for parameter passing
and the Ri local registers are automatically saved.


An automatic escape mechanism is added to the
recursivity. Writing top-down recursive parsers
requires such devices, which are similar to software
interrupts (like the "ON conditions" of PL/1).

An escape microinstruction performs a return
operation to the last call microinstruction which
has set, in the recursivity stack, a tag number
(Ei constant) equal to the tag number of the escape
microinstruction. Escapes and recursivity are two
closely related concepts; hence they have
been merged in order to offer a better
systematization of the control transfer between
microprocedures. It is thus stated that, in LEM, calls are
recursive and returns are escapes.


The control unit is illustrated in fig. 7.

Its main components are :

ESC stack : enables the escape number to be saved
when a recursive call occurs

COMP : when an escape microinstruction is
performed, it indicates whether the escape
number, given as an argument, corresponds
to the escape number that is read from
the ESC stack

MPX3 : 32x1 multiplexer. The selection is made
according to the Ti number, passed as
an argument, while T allows the
output to be inverted or not

µPC stack : is the saving stack for the
microinstruction address register

ADDER : is a simple adder to perform relative
branches

MPX1, MPX2 : are the input multiplexers of the
microinstruction address register

µPC : is the microinstruction address register

µPC control : produces the control signals which
correspond to the operation code of the
current microinstruction.


The control unit microinstructions

The five basic microinstructions dealing with
the sequencing of the microprograms are : the
continuation, the conditional branch, the multiple
branch, the recursive call and the escape.

1. Continuation

CONTINUATION | Subroutine address (SB)

µPC ← SB

Owing to the continuation microinstruction, it is
possible to perform branches between the
microprocedures without any push operation.
Fig. 7 - THE CONTROL UNIT


(*) The storage of the µPC into the stack is performed after the incrementation and before the load.

Table 1 - THE MICROINSTRUCTIONS OF THE CONTROL UNIT


2.  Conditional  branch 


if  (T»  1  and  T.=  l)  or  (T«0  and  Ti  =  0) 


then  oPC  -  pPC  +  BR  +  1 
else  yPC  pPC  +  1 

The  deplacement  BR  is  signed.  The  signe  bit  is  in 
tiio  most  significant  position. 

3.  Multiple  branch 


pPC  v  yPt;+  1+  F-i  (Aj )  (la  2) 

This  microinstruction  enables  a  decoding  starting 
from  a  sub-field  (Fj)  of  the  descriptor. 


4.  Ca_U 


v 

*  iiPC  <■  pPC  +  1 

*  the  current  context  is  saved  into  the  stack 

Ri  stack  Ri *0,3 

ESC  stack  4-  j  SC 

uPC  stack  4-  ypc 

*  nPC  4.  SB 

V 

5.  Escape 


* The context is popped from the stack :

Ri, i = 0,3 ← Ri stack

* if Ei = ESC stack then µPC ← µPC stack

The escape microinstruction is executed as many
times as necessary until finding an escape number
corresponding to that specified in the Ei field. It
scans the control unit stack in search of its
corresponding context. Hence, it generalizes the return
mechanism.
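
The sequencing rules above can be illustrated by a minimal Python sketch. It is not the M3L microcode: the class name, the method names and the choice to pass the escape number explicitly to `call` are assumptions made for illustration, and the multiple branch is left out because its field decoding is not fully legible in the text.

```python
from dataclasses import dataclass, field


@dataclass
class ControlUnit:
    upc: int = 0                                              # µPC
    regs: list = field(default_factory=lambda: [0, 0, 0, 0])  # R0..R3
    stack: list = field(default_factory=list)                 # (R copy, ESC, µPC)

    def continuation(self, sb):
        """µPC <- SB, with no push."""
        self.upc = sb

    def conditional_branch(self, t, ti, br):
        """Relative branch by the signed displacement BR when Ti equals T."""
        self.upc = self.upc + br + 1 if t == ti else self.upc + 1

    def call(self, sb, esc):
        """Increment µPC, save the context, then load SB."""
        self.upc += 1
        # Assumption: the escape number saved with the context is given here.
        self.stack.append((list(self.regs), esc, self.upc))
        self.upc = sb

    def escape(self, ei):
        """Pop one context; report whether its escape number matched Ei."""
        saved_regs, saved_esc, saved_upc = self.stack.pop()
        self.regs = saved_regs
        if saved_esc == ei:
            self.upc = saved_upc
            return True
        return False  # µPC unchanged: the escape is executed again

    def escape_until(self, ei):
        """Repeat the escape until a matching context is found; count the tries."""
        tries = 1
        while not self.escape(ei):
            tries += 1
        return tries
```

With three nested calls carrying escape numbers 1, 2 and 3, an escape on number 1 pops all three contexts and resumes after the outermost call, which is the sense in which the escape generalizes the return.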


CONCLUSIONS 

The first remark that we can make about the
M3L architecture is related to the numerical
processing ; it is not absent, since without it there
would not be any execution, but it takes a secondary
place. This does not imply that M3L is not able
to perform this kind of processing efficiently. On
the contrary, owing to the advanced integration
capabilities, an LSI family ensures, alone, the
functions of the conventional architecture very
efficiently.


Whereas the numerical processing can be easily
integrated, this is not true for the non-numerical
processing. As a matter of fact, it deals mostly
with the organization of the information. It does
not need any special processor, but it is expressed
through the distribution of the resources in the
computer architecture. On M3L, special attention
was paid to the organization of the resources and
in particular to memory management ; the M3L
architecture is based upon two memories : the
pair-cells memory and the stack memory.

The M3L project started in September 1977. The
prototype, drawn during 1979, is presently in the
achievement phase and will be operational in June
1980. The complete machine, with the input/output
interfaces for the connecting of the TTY and disks
management, is made of five boards following the
European standards. The prototype is equipped with
a 64 K pair-cells memory and a 16 K stack memory,
representing 70 percent of the chips.

The architecture of M3L is simple. Just like the
Von Neumann architecture, it varies in direct ratio
with the size of the memory. Therefore, it can serve
as a basis for a line of general host systems. The
present implementation corresponds to a middle need,
but a new version of M3L, with a virtual pair-cells
memory, is being studied, in which the datapath will
be 24 bits wide. Just like the Von Neumann architecture,
it offers a systematic approach for the implementation
of the direct execution scheme, which makes it easy
to understand and to use. Consequently, it bears the
required features for a large diffusion. From that
time onwards, there is no doubt that such an
architecture, and more generally x-architectures, will
supersede the conventional sequential computer
systems.
ABSTRACT

This paper presents a methodology for the definition
of a high level machine for a real time language.

First, the choice of an indirect execution computer
architecture for this class of language is discussed.

Apart from the algorithmic aspect already examined
in previous realisations, this type of language
creates problems of management in a multi-task
environment, of definition of the concept of
interruption on a high level machine and of implementing
complex systems which require a structured design.

An application of the defined methodology is
described, which consists of the definition and
realization of a high level machine for the LTR
language, with emphasis on the implementation of problems
specifically linked to real time.

INTRODUCTION 

The design of a general-purpose computer usually
precedes the design of the software tools it is
intended to support ; software and hardware interfacing
is performed by instructions in the machine language
managing the physical resources of the computer. The
implementation of a high level language on a general
purpose computer therefore calls for the presence
of translators which produce (compilers) or use
(interpreters) these instructions.

The semantic gap between the external form of a
high level language and the machine language entails
very complex, expensive translators which are not
necessarily free from errors.

In  the  last  20  years  many  high  level  languages 
adapted  to  programmer's  needs  have  appeared  which 
have  been  implemented  with  the  help  of  compilers. 

At present a large number of high level languages
exists which correspond to most programming needs.

The definition of a data processing system
(computer + language) may, therefore, move in a new
direction : given a chosen programming language, let us
define a computer architecture associated with this
language.

This approach is attractive for two fundamental
reasons :

- choice of the language which best expresses the
problems to be dealt with (FORTRAN for scientific
calculations, COBOL for management decisions,
PASCAL for general applications, ...)

- the efficiency of an architecture designed
specifically to support the language.


In recent years, many studies have been carried
out based on languages which are, essentially,
algorithmic (FORTRAN, PASCAL, EULER, BASIC, SYMBOL, ...).

The study described in this paper concerns the
definition of an architecture specialized in the
execution of a real-time system. The fact that a real
time application is taken into account introduces
some specific problems :

- the programming system is composed of very
numerous (> 500) interacting programs ; therefore, on
the one hand, there is an extremely large volume
of source programs (in the region of several
hundreds of thousands of instructions) and, on the
other, the problems of synchronization between the
different tasks are crucial

- task switching must be efficient so that an
internal or external event can be taken into account
as quickly as possible

- the computer must allow separate execution of the
different tasks so as to ensure a structuring
of the application.

I  -  INDIRECT  EXECUTION  ARCHITECTURE 

An indirect execution architecture is made up of
two distinct parts :

- a software module, which produces an intermediate
language based on the source language

- a hardware module, which executes this intermediate
language.

The crucial point of this approach is the
definition of the intermediate language (IML), which must
be sufficiently close to the source language if the
compiler is to remain simple, and sufficiently close
to the hardware if the execution is to be efficient.

It follows, therefore, that there cannot be a
general purpose IML adapted to every machine
language and architecture. The definition of such an
architecture must, therefore, start with the definition
of this intermediate level.

Separate  module  compiling  thus  demands  existence 
of  a  linkage  editor  to  generate  an  executable  system. 

Two  solutions  may  be  envisaged  : 

- static link editing takes up the concepts
which exist on conventional machines and furnishes
an executable module

- dynamic link editing is carried out at
execution time ; in this case, when the resident
system meets an external reference, it must bring
the module into central memory and start the
execution. This procedure, which includes an address
computation, is time costly.

The choice between these two techniques depends
on the source language organization and the
constraints of execution time.

Direct execution computer architecture, on the
other hand, can support the execution of a high
level language without any change of the original text.
This approach presents many advantages (suppression
of the whole software system : the compiler, the
linkage editor, the loader ; interactive program
debugging) for a certain type of application ; it
seems, however, to be difficult to implement for
complex systems, notably for multi-task real time
systems. For example, the definition of interruptible
points in such a layout is rather delicate : an
interruption can be enabled either at fixed points in
the execution of a source instruction (at the
beginning or at the end), in which case the masking
time may become too long to comply with the system
specifications, or at each analysed token, in which
case the processor context may become too
voluminous and context switching inefficient.

2  -  HIGH  LEVEL  ARCHITECTURE  FOR  A  REAL  TIME  LANGUAGE 

The  need  for  efficient  execution,  the  management 
of  a  multi-task  environment  and  the  complexity  of 
the  real  time  systems  involved  lead  to  the  choice 
of  an  indirect  execution  architecture  to  support 
the  execution  of  these  systems. 

This methodology, essentially interpretive,
combines the advantages of the compiling and
interpretation techniques.

The  source  text  is  translated  into  a  coded  text, 
compact  and  syntactically  correct,  whose  execution 
may  be  restarted,  postponed  or  linked  with  other 
modules. 

The  intermediate  text  is  interpreted  with  the 
help  of  microprogramming  techniques  on  a  data  path 
adapted  to  its  interpretation. 

This methodology avoids the two basic reproaches
which are levelled at compilation and interpretation.
The compiling phase is simple, since it does not
perform code generation and optimization as in
classic compilers. Moreover, the text produced is
independent of machine resources (memory, registers, ...)
and the semantics of the instructions are close to
the source language.

The interpretation of such a language level may
be efficient thanks to microprogramming. Classic
programmed interpreters were not very efficient, as
they were in the central memory and they acted on
rudimentary data paths (adders, registers). On the
other hand, a microprogrammed interpreter is in
control store (with an access time about 10 times
faster) and present day technology allows the creation
of data paths better adapted to interpretation.

2.1.  Intermediate  machine  language 

The compilation phase must make the source text
directly interpretable. The properties of these DELs
(Directly Executable Languages) have been largely
defined by L.W. HOEVEL. This phase comprises,
therefore, a syntactic and semantic analysis of the
source text, symbol processing, processing of
forward references and labels and the prefixing (or
postfixing) of the instructions. This processing
may be defined as a transfer from a concrete machine
(source text), defined by a concrete grammar, to an
abstract machine (DEL), defined by an abstract
grammar, used by the interpreter to execute the abstract

The form of the IML is determined by the nature
of the language ; however, some characteristics may
be singled out. The transfer of the source program
into the virtual machine brings about an
environmental change. An intermediate environment may be
composed of three types of space :

- program space

- descriptor space

- data space

The  program  written  in  IML  is  a  finite  series  of 
binary  fields,  of  varying  length.  These  fields  are 
the  operation  codes,  operand  identifiers,  descriptor 
space  references  or  constants. 

The  descriptor  space  contains  all  the  semantic 
information  on  the  data,  and,  notably,  the  type  and 
the  access  mode  to  the  data  space. 

The data consist of information of varying length.
They represent arithmetic values, texts, system
information (events, semaphores) or procedural
parameters.

2.2.  Characterization  of  interpretation  processing

Interpretation processing comprises three types
of processing :

- organic processing, associated with the management
of the tasks making up the system (activation-
deactivation) and managing the machine resources

- formal processing, associated with an execution
control managing the execution of a task

- effective processing, associated with the final
execution of the instructions.

The central processing units of present-day
computers are designed solely for the execution of
effective processing.

A high level architecture must, therefore, be
made up of hardware structures able to support
formal processing and organic processing efficiently.
These structures must permit a description of program
algorithms at a macroscopic level ; that is, at the
level of the algorithmic logic.

Effective processing, on the other hand, permits
a description of the algorithms at a microscopic
level ; that is, at the level of the realization of
functions.
APPLICATION

HIGH LEVEL ARCHITECTURE FOR THE LTR LANGUAGE*

LTR is a ten-year-old real time language whose
applications are now implemented on classic computers
(MITRA, IRIS, ...) through the intermediary of a
compiler which produces a symbolic text which must be
assembled on the target machine.

This implementation scheme is not very efficient,
either at the compiler level or at the code
generation level.

On the other hand, this language is complete
enough to be able to express most of the problems
of a real time application. Therefore it has been
chosen by several departments of the French Defense
Department for writing real time systems.

The problem is the definition of a machine
architecture which can support its execution
efficiently. We shall, therefore, examine an indirect
execution computer architecture to execute LTR, even
though this is a compiler oriented language.

1. PRESENTATION OF THE LANGUAGE

LTR, Real Time Language (Langage Temps Réel), is
a high level programming language intended for
systems realization. It presents a highly structured
organization, shown by a partition into ARTICLES at
the highest level. An LTR system is a set of ARTICLES.

1.1.  Types  of  articles 

Data articles are of three types :

- DATA ARTICLE : data shared by a program and its
subroutines

- GLOBAL DATA ARTICLE : data common to the system
data set

- SYSTEM DATA ARTICLE : data specific to the system
environment.

The processing articles describe the algorithms
concerning the data declared in the data articles
or in the processing articles.

There are three types of processing articles :

- PROCEDURE ARTICLE : corresponds to the concepts
of subroutines or functions

- PROCESS ARTICLE : describes a process running in
a multi-task context (concept of software task)

- INTERRUPT PROCEDURE ARTICLE : describes a process
whose execution is tied to the interruption system
(concept of an immediate task).

1.2.  Structure  of  a  LTR  system 

Figure 1 describes an LTR system ; the separate
compilation of a task may be carried out, the
compilation unit being :

<SYSTEM DATA ARTICLE><GLOBAL DATA ARTICLE>*<EXTERNAL
GLOBAL PROCEDURE>*<EXTERNAL PROCESS>*<task>

Program procedures may be called only by those
of the same task.

A task may activate another task and take back
control at the end of execution (closed call) or
lose this control to the advantage of a task with
higher priority (open call).

* This work is supported by the Direction des
Recherches et Etudes Techniques (DRET) of the French
Defense Department, at the Department of Computer
Science of the Paul Sabatier University and the
Department of Computer Engineering (ONERA-CERT) of
the Centre d'Etudes et de Recherches de Toulouse.


The implemented system must ensure local procedure
recursivity and task reentry.


[Figure : GLOBAL ARTICLES (SYSTEM DATA, GLOBAL DATA,
GLOBAL PROCEDURES) ; the PROCESS articles (PROCESS ...
PROCESS i), each with its DATA ARTICLES and PROCEDURE
ARTICLES ; the INIT PROCESS with its DATA ARTICLES and
PROCEDURE ARTICLES ; the START PROCESS article]

Fig. 1 : STRUCTURE OF AN LTR SYSTEM


The range of the identifiers outside the
processing article is as follows :

. the only accessible data are those declared in :

- the task DATA ARTICLES

- GLOBAL DATA ARTICLES

- the parameters

. the only usable procedures are :

- the task PROCEDURE ARTICLES

- the GLOBAL PROCEDURE ARTICLES

Inside the article, the classic block structure
rules must be respected.

1.3.  Principle  of  data  allocation 

In LTR, the type of article and the data
organization lead to different data storage allocations.

A.  Static  and  permanent

These are the data, tables or structures declared
in a GLOBAL DATA ARTICLE or in a DATA ARTICLE.
The store space is reserved by the compiler and
the lifetime is linked with that of the task.

B.  Automatic  allocation

These are the data, tables or structures locally
declared in the processing articles. The data are
dynamically initialized and data overlay takes place
according to the block structure. The lifetime
is linked to the internal block in which they are
declared.

C.  Controlled  allocation

This concerns virtual data pointed to by the user.
The data are described in a data or processing
article : the link between the description and the data
zone to which it applies is realized by the execution
of pointer manipulation instructions or by storage
allocation.

D.  Chain  allocation

This concerns sets pointed to by the user but whose
chaining is automatically ensured by the allocation
mechanism. The data are described in a GLOBAL
DATA ARTICLE.

CODE  | Param 1   | Param 2            | Param 3 and others | FUNCTION                                   | Notes
------+-----------+--------------------+--------------------+--------------------------------------------+------
AFF   | operand   | opde or constant   |                    | Assignment                                 |
ADD   | operand   | opde or constant   | opde or constant   | Param 1 ← Param 2 + Param 3                | (1)
LSS   | operand   | operand            |                    | Comparison of Params 2-3 and assignment of | (1)
      |           |                    |                    | the (boolean) result to Param 1            |
IF    | address 1 | address 2          | address 3          | (f 1)                                      | (2)
FOR   | address 1 | address 2          |                    | (f 2)                                      | (2)
WHILE | address 1 | address 2          |                    | (f 3)                                      | (2)
CALL  | operand   | (opde or constant) |                    | Param 1 : descriptor of the procedure      |
      |           |                    |                    | Param 3 : parameter list                   |
CALLP | operand   | entry TD           |                    | Entry TD : address of a TASK DESCRIPTOR    |
      |           |                    |                    | Params 2-3 : identical                     |
NEW   | operand   | operand            | opde or constant   | Insertion of an element in a set           |
      |           |                    |                    | Param 1 : set                              |
      |           |                    |                    | Param 2 : insertion address                |
      |           |                    |                    | Param 3 : name of element to be inserted   |

(f 1) IF <a1><a2><a3> <exp.bool. block> <THEN block> <ELSE block>

(f 2) FOR <a1><a2> <incr. block + test> <FOR block>

(f 3) WHILE <a1><a2> <exp.bool. block> <WHILE block>

(1) Parameter 1 may be an intermediate variable produced by the compiler

(2) The addresses are N-uple addresses

This presentation of the language fixes the
constraints on defining memory management for an LTR
machine. We shall present the solution chosen for
implementing such a system below.

2. INTERMEDIATE LANGUAGE (DEL) FROM LTR

An intermediate instruction is a byte chain
of varying length called an N-uple.

An N-uple may be an expression (OPERATOR, (OPERAND)*)
in which the number of operands is fixed
only by the LTR instruction specifications.

Definition of the operator codes is fixed by the
LTR instructions ; each instruction has been
regrouped in the form of an N-uple, at the same time
conserving all the semantics contained in the source
instruction.

The table above gives some examples of N-uples.

In the operand part, we may find either a
constant, an N-uple address, or a data descriptor
address. The operand is prefixed by a directive which
prescribes the descriptor type :

(ch)  'corchain' for bit chains
(ix)  table index
(rf)  reference of a structure field
(pt)  pointer to a set
(ct)  constant
(op)  operand
(cv)  conversion

The DEL-LTR may be summarised schematically as
follows :

IML     ::= (N-uple)*
N-uple  ::= (OPCODE, (OPERAND)*)
OPERAND ::= (CTSI) (OPDE), (CONV/MOD, (CONV))
OPDE    ::= op, H.S.H., (INDEX)
CONV    ::= cv NUMBER
MOD     ::= ct, (CTSI/OPDE, CTSI/OPDE) / pt, H.S.H., CONV
INDEX   ::= ix, H.S.H. / indexi, CTEi
CTSI    ::= ct, CTEi
H.S.H.  ::= address of descriptor
CTEi    ::= immediate constant

This intermediate form is very close to the
source language. The semantic information contained
in an LTR instruction has been coded in the
intermediate instruction so as to facilitate
interpretation : the interpreter will analyse instruction
prefixing by operation code, execution and control
addresses and operand directives.

All non-constant variables are addressed through
a descriptor which contains the set of information
characterising the data used by the interpreter.

The basic descriptor is a 10-byte word which may
have extensions for complex operands (table,
structure, process descriptions). In the standardized
form, the fields are :

NAME | INDIC | BASE | DEPL. | TYPE | STRUCT | SIZE | SCALE | EXT

NAME : name of the variable ; this information allows
the editing of the state of the variables during
the debugging phase

INDIC : data implantation type : global, local, parameters

BASE-DEPLACEMENT : data implantation address

TYPE : Integer, Real, Fixed, Index, Character string,
logic, boolean, quality, static reference,
virtual data reference, set element reference

STRUCT : array, structure, structure array, virtual data,
set

SIZE : space occupied by the data

SCALE : normalisation factor

EXT : pointer to an extension descriptor
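
To make the relation between an N-uple, its operand directives and the descriptors concrete, here is a minimal Python sketch. The class names, the field encodings (plain strings and integers rather than the packed 10-byte layout) and the rule that the implantation address is base plus displacement are assumptions made for illustration; the source gives only the field names.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Descriptor:
    name: str                  # NAME: used when editing variable state while debugging
    indic: str                 # INDIC: "global", "local" or "parameter"
    base: int                  # BASE
    depl: int                  # DEPL.: displacement
    type: str                  # TYPE: e.g. "integer", "real", "boolean"
    struct: Optional[str]      # STRUCT: e.g. "array"; None for a scalar (assumption)
    size: int                  # SIZE: space occupied by the data
    scale: int = 0             # SCALE: normalisation factor
    ext: Optional[int] = None  # EXT: address of an extension descriptor, if any


@dataclass
class Operand:
    directive: str             # "ct" (immediate constant) or "op" (descriptor operand)
    value: int                 # the constant itself, or the descriptor address


@dataclass
class NUple:
    opcode: str
    operands: Tuple[Operand, ...]


def data_address(d: Descriptor) -> int:
    """Assumption: the implantation address is base plus displacement."""
    return d.base + d.depl


# Hypothetical descriptor store, indexed by descriptor address.
descriptors = {
    0: Descriptor("A", "global", base=0x100, depl=0, type="integer", struct=None, size=2),
    10: Descriptor("B", "global", base=0x100, depl=2, type="integer", struct=None, size=2),
}

# A := B + 1, following the ADD row of the table of N-uple examples.
add = NUple("ADD", (Operand("op", 0), Operand("op", 10), Operand("ct", 1)))
```

The point of the sketch is that the N-uple itself carries only directives and descriptor addresses; everything the interpreter needs to know about a variable is reached through the descriptor.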


3. LTR PROCESSOR STRUCTURE

The LTR processor structure follows from the
methodology described above.

The processor is composed of two pipe-line units,
one for macro-interpretation processing (MAI), the
second for micro-interpretation processing (MII)
(fig. 2).


Fig. 2  :  LTR  processor  block-diagram 


The central memory is divided into three
physically separate memories :

- the N-uple memory contains the intermediate code
and is accessible to the MAI processor only

- the descriptor memory contains the data
descriptors, systems data and processes : it is
accessible to the MII processor only

- the data memory contains the data described in
the source program.

An N-uple is interpreted in two phases :

- the first, in the macro-interpreter (MAI), manages
the IML execution control ; it divides an N-uple
into simple instructions which it sends to the
micro-interpreter (MII)

- the second phase, therefore, takes place in the
micro-interpreter (MII), which merely executes,
sequentially, the actions sent by the MAI : search
for operand descriptor, conversion of a number,
arithmetic operations ... ; these actions
correspond to a set of microprograms contained in the
MII control store.

The connection between the two units is
realized through the intermediary of two hardware queues :
a parameter queue and an action number queue.
Moreover, state variables and calculation results may
transit between the two units.

The two queues allow a synchronization of the
two processors and ensure pipe-line management.
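
The two-queue coupling can be sketched in Python as follows. This is only an illustration of the principle: the action numbers `LOAD`, `ADD`, `STORE`, the `PARAM_COUNT` table and the small working list inside the MII are hypothetical, and a dictionary stands in for the descriptor and data memories.

```python
from collections import deque

# Hypothetical action numbers and the number of parameters each one consumes.
LOAD, ADD, STORE = 0, 1, 2
PARAM_COUNT = {LOAD: 1, ADD: 0, STORE: 1}


def mai_decode(n_uple, action_q, param_q):
    """MAI side: split one N-uple into simple actions and queue them."""
    opcode, dest, left, right = n_uple
    if opcode != "ADD":
        raise ValueError("only ADD is modelled in this sketch")
    for action, params in ((LOAD, [left]), (LOAD, [right]), (ADD, []), (STORE, [dest])):
        action_q.append(action)
        param_q.extend(params)


def mii_run(action_q, param_q, memory):
    """MII side: execute the queued actions sequentially."""
    work = []
    while action_q:
        action = action_q.popleft()
        params = [param_q.popleft() for _ in range(PARAM_COUNT[action])]
        if action == LOAD:
            work.append(memory[params[0]])
        elif action == ADD:
            right = work.pop()
            left = work.pop()
            work.append(left + right)
        elif action == STORE:
            memory[params[0]] = work.pop()


action_queue, param_queue = deque(), deque()
memory = {"A": 0, "B": 2, "C": 3}
mai_decode(("ADD", "A", "B", "C"), action_queue, param_queue)
mii_run(action_queue, param_queue, memory)
```

In the hardware the two sides run concurrently and the queues absorb the difference in pace; the sketch runs them one after the other only to stay short.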


- the N-uple memory has read access over 4 bytes

- the descriptor memory has a double read/write
access, also over 10 bytes ; the first contains the
descriptor and the second the context of the
micro-machine

- the data memory has a read/write access over two
bytes, the size of the data path being 16 bits.

The scheduling algorithm runs on the
micro-interpreter, which sends a task number to the MAI ;
the context set is described in the CONTEXT section.

3.1.  Macro  Interpreter  Structure  (fig. 3) 

The macro-interpreter supports the formal and
organic processing attached to the system execution
control. Formal processing amounts to management of
the N-uple ordinal counter (management of the
recursivity of IML instructions) and organic processing
concerns procedure context switching. A context
switch may occur on two types of event :

- switching on interruption

- switching on process call

In the first case, the interrupted process
contexts may be managed in stacks ; the interruption
mechanism can be implemented according to a hierarchic
algorithm.

When the process attached to the interruption of
level i takes place, it can be interrupted only by an
interruption of level j (j > i) ; control will be
returned, after processing of level j, to the level i
process or to a process with a higher priority.

This mechanism may be implemented with the help of
just one stack, the summit context being the active
context.
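
The hierarchic rule and its single-stack implementation can be sketched as below. The class and method names are hypothetical, and the sketch simply refuses an interruption whose level is not higher than the active one; how the real machine keeps such requests pending is not described in the text and is not modelled.

```python
class InterruptStack:
    """Single stack of interrupted contexts; the top entry is the active one."""

    def __init__(self):
        self.contexts = []            # entries are (level, process_name)

    def active(self):
        return self.contexts[-1] if self.contexts else None

    def interrupt(self, level, process_name):
        """Accept the interruption only if its level exceeds the active one."""
        top = self.active()
        if top is not None and level <= top[0]:
            return False              # not accepted: level j must satisfy j > i
        self.contexts.append((level, process_name))
        return True

    def end_of_process(self):
        """Pop the finished context and return the one that becomes active."""
        self.contexts.pop()
        return self.active()
```

Because a level-j process can only be stacked on top of a lower level, the contexts are always ordered by level, which is why one stack is enough.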

On the other hand, for a process activated by an
open call, it is possible to avoid returning to the
calling process. A stack must, therefore, be
allocated to this process and, during switching, the
number of the stack containing the caller's context must
be saved. The task is then executed in its
own stack space. For all closed calls, the context
may be saved in the active stack (a mechanism
identical to that of activations on interruption), and
for pseudo-open calls (an open call which returns
control to the calling process) two stack spaces are
sufficient.

We allow for 16 stack spaces (15 + interruption),
which permit an interleaving of 15 open calls without
return to the calling process. The size of each space
is assessed at 1 Kwords. This space and the
management mechanism are represented by stack II. The
micro-interpreter context will be switched at the top of
the active stack, the active stack being found in
the process descriptor.

Ordinal  counter  management  is  ensured  by  a  reen¬ 
trant  microprogrammed  interpreter  whose  essential 
functions  are  : 

-  access  to  the  source  text 

-  analysis  of  the  instruction  operation  code 

-  to  break  up  an  N-uple  into  elementary  ACTION 
functions . 


The division of the stores according to the
information they contain allows a real parallelism
between the different accesses and also
particularisation of each access :

Fig.  5  :  THE  SCHEDULING  PRINCIPLE


Example : Interpretation of an IF instruction.

When the operation code is decoded, the
interpretation consists in :

- stacking the three addresses <a1><a2><a3> in
stack II of the active procedure

- loading <a1> onto the CPT register

- calling a rule <boolean expression> (1).

The end of the <exp.bool. block> is supplied by
the comparator, which determines the equality between
the CPT register and the instruction counter (IC).

Depending on the value of the boolean transmitted
by the MII, address a2 is loaded onto the IC
and address a3 is loaded onto the CPT (value 0), or
address a2 is loaded onto the CPT and the IC
register is not affected (value 1). At the end of
<THEN block>, address a3 is loaded onto IC.

(1) This call is carried out by stacking AR onto
stack I, and the return of the rule provokes a pop
operation. This mechanism allows an interpretation
of the language in accordance with a method of
descending analysis.
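
The register movements of this example can be traced with a short Python sketch. It assumes, as the description suggests but does not state outright, that a1 marks the end of the boolean expression block (and the start of the THEN block), a2 the end of the THEN block (and the start of the ELSE block), and a3 the end of the ELSE block; the function name and the returned trace format are hypothetical.

```python
def interpret_if(ic, a1, a2, a3, evaluate_condition):
    """Return the blocks executed as (name, start IC, CPT limit) and the final IC."""
    stack_ii = [a1, a2, a3]          # the three addresses are stacked in stack II
    cpt = stack_ii[0]                # CPT <- a1
    executed = [("condition", ic, cpt)]
    ic = cpt                         # comparator: IC has reached CPT
    if evaluate_condition():         # boolean value 1
        cpt = stack_ii[1]            # CPT <- a2, IC not affected
        executed.append(("then", ic, cpt))
        ic = stack_ii[2]             # end of THEN block: IC <- a3
    else:                            # boolean value 0
        ic, cpt = stack_ii[1], stack_ii[2]   # IC <- a2, CPT <- a3
        executed.append(("else", ic, cpt))
        ic = cpt                     # ELSE block runs until IC equals CPT
    return executed, ic
```

Whichever branch is taken, the instruction counter ends at a3, so the N-uple that follows the IF is reached without any further test.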


3.2. Micro-Interpreter Structure (fig. 4)

The micro-interpreter corresponds to the CPU of
conventional computers. It is composed of a control store
containing the set of interpretation microprograms and a
data path formed by an arithmetic logic unit (AMD
2903) and a Bit Pattern Manipulator (BPM) capable
of performing logic operations on bit sets
(permutation of bytes, extraction and scaling of bit fields,
concatenation). The MII manages access to the descriptor
and data stores and executes the part of organic
processing relative to the management of the data
space of a procedure.

The access register of the descriptor store is,
in fact, a local memory composed of three blocks of
ten bytes. This memory constitutes an extension of
the internal registers of the AMD 2903 microprocessor.

The first block contains a procedure descriptor
or a data descriptor, the second may contain a data
descriptor, and the third contains the MII context.
We shall see in the CONTEXT section that this
solution allows an optimisation of context switching.




4.  STORE  MANAGEMENT 


4.1.  Data  store managenant 


Logically,  this  store  should  be  managed  in  such 
a  way  that  tha  implantation  of  data  and  way  of  acce¬ 
ding  to  it  should  be  directly  deducible  from  tha  LTR 
system  structure  and  from  the  constraints  quoted  in 

(I). 

The  structuration  of  the  program  into  ARTICLES 
suggests  an  addressing  in  relation  to  different 
bases.  This  technique  allows,  moreover,  the  defini¬ 
tion  of  a  protection  for  each  segment,  an  important 
factor  in  the  real-time  field. 


It will, however, be necessary to allow for direct
addressing, in particular for the passing of parameters
by address.


Since the LTR processor supports recursion and reentrancy
of procedures and processes, we are led to allocate a stack
for each process, where the context of each procedure call
is saved and the local data of the called procedure are
created.

It can be seen that base addressing is not sufficient to
manage the memory efficiently. There is a possibility of a
proliferation of zones of dynamically created data. It
follows that it will be difficult to recover the free
space, and for this reason we have added to the addressing
system a system of storage allocation by paging and a
"topographic" store.

However, we have also tried to adapt the addressing mode
to the type of accessed data by addressing directly the
global data, whose life expectancy is that of the system,
and reserving topographic addressing for data with a
shorter life. The characteristics of these different zones
are determined by the requirements of the LTR system to be
executed.


To sum up, we have allowed for the following addressing
modes, which appear in the descriptions of the system
variables:

- general direct addressing, for the use of data declared
  in a GLOBAL DATA ARTICLE

- direct addressing, for the use of the process or
  procedure call parameters and also the sets

- topographic addressing, localised in the process, for
  the use of data declared in a DATA ARTICLE

- topographic addressing, localised in the procedure,
  which concerns the process stack, for the use of data
  declared in a PROCEDURE ARTICLE, global or not.


Different address calculations

Let TOPO be the function calculating the real address of a
variable from its virtual address. This association
function consists in replacing the virtual page number by
the real page number. This association is realised during
storage allocation by the operating system and is
materialized by a "topographic" store. The list of pages
allocated to a process is part of its context.

Therefore:

- Calculation of a general direct address (GDA)

      a = (base G) + displacement

- Calculation of a reference direct address (DA)

      a = displacement

  An address of this type is always contained in a pointer.

- Calculation of a process local address (PSA)

      a = TOPO ((base L) + displacement)

- Calculation of a procedure local address (PDA)

      [formula illegible in the source]


4.2. Descriptor addressing

The data of a program are referenced in the code through
the intermediary of a descriptor. It is implanted in a
memory 10 bytes wide and addressable on 64 K. However, in
order to simplify program debugging, the LTR source text
may be compiled by modules (an executable system may be
composed of several modules). The solution classically
adopted in machine languages to assemble the different
modules consists in linking the process dynamically. We
have not retained this solution as it has proved to be too
costly in execution time and considerably increases the
system overhead. We have, therefore, chosen to address the
descriptors by (base, displacement). At a given moment we
thus have three bases:

- Base of Global data descriptors

- Base of Data descriptors

- Base of local data descriptors for the active procedure.

The values of these bases are determined when loading the
blocks they reference. It is to be noted that these bases
are an integral part of the process context.
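
The address calculations of section 4.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the LTR microcode: the page size, the dictionary standing in for the "topographic" store, and all function names are hypothetical.

```python
PAGE_SIZE = 256  # assumed page size; the paper does not state one


def topo(virtual_address, page_table):
    """Replace the virtual page number by the real page number."""
    virtual_page, offset = divmod(virtual_address, PAGE_SIZE)
    return page_table[virtual_page] * PAGE_SIZE + offset


def general_direct_address(base_g, displacement):
    # GDA: a = (base G) + displacement
    return base_g + displacement


def direct_address(displacement):
    # DA: a = displacement (the address is held in a pointer)
    return displacement


def process_local_address(base_l, displacement, page_table):
    # a = TOPO((base L) + displacement)
    return topo(base_l + displacement, page_table)


# Hypothetical page list of one process: virtual page -> real page.
page_table = {0: 7, 1: 3, 2: 12}
```

Because the page list belongs to the process context, switching processes in this sketch amounts to passing a different `page_table`.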


4.3.  Implementation  of  data  systems 

We  shall  now  examine  the  solutions  adopted  for 
the  implementation  of  the  system  processors, notably: 
the  scheduler,  management  of  events  and  semaphores 
and  interrupts. 

4.3.1.  Processor  implantation 

The processor monitors are microprogrammed and run on the
micromachine. The data manipulated by these programs are
implanted in the form of descriptors, for protection
purposes. In effect, only the microprograms are authorized
to write in the descriptor store during the execution of a
system. These processors manipulate descriptor strings.


4.3.2. Implantation of scheduler data

The scheduler manipulates process descriptors. These
descriptors have the following structure:

NAME | INDIC | LAV | LAR | EXT | BASE CODE | BASE DESC DATA | BASE PROG SPACE | PROINIT | STACK NUMBER

NAME            : pointer towards process identification
INDIC           : process current state word
LAV-LAR         : chaining of the process in queues
EXT             : pointer towards an extension
BASE CODE       : address of code implantation
BASE DESC DATA  : address of data descriptors
BASE PROG SPACE : address of data
PROINIT         : pointer towards procedure status descriptor

The scheduler manipulates only the CU processor's queue
(ready processes). In effect, the other lists are
manipulated by the other system processors, which return
control to the scheduler at the end of their execution.
The head of this list is represented by a descriptor
implanted at a fixed address, with the form:

FIRST | LAST | NB LIST | NB CREATED | NB ACTIVE

FIRST, LAST : reference points on the list
NB LIST     : number of processes in the list
NB CREATED  : number of processes created
NB ACTIVE   : number of active processes at present.
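
A minimal sketch of this queue structure follows. It assumes that LAV and LAR are forward and backward links (the paper only says that they chain the process into queues); the class and method names are hypothetical, and only the NAME, INDIC, LAV, LAR, FIRST, LAST and NB LIST fields are modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class ProcessDescriptor:
    name: str
    indic: str = "ready"                        # current state word
    lav: Optional["ProcessDescriptor"] = None   # assumed: forward link
    lar: Optional["ProcessDescriptor"] = None   # assumed: backward link


@dataclass(eq=False)
class QueueHead:
    first: Optional[ProcessDescriptor] = None
    last: Optional[ProcessDescriptor] = None
    nb_list: int = 0

    def append(self, p):
        """Chain p at the tail of the list."""
        p.lav, p.lar = None, self.last
        if self.last is None:
            self.first = p
        else:
            self.last.lav = p
        self.last = p
        self.nb_list += 1

    def pop_first(self):
        """Unchain and return the process at the head, or None if empty."""
        p = self.first
        if p is None:
            return None
        self.first = p.lav
        if self.first is None:
            self.last = None
        else:
            self.first.lar = None
        p.lav = p.lar = None
        self.nb_list -= 1
        return p
```

Keeping both FIRST and LAST in the head descriptor makes insertion at either end a constant-time operation, which is presumably why the head carries two reference points.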


4.3.3.  Event  and  semaphore  management 
We first decided not to implement event expression
resolution. Our choice was motivated by the complexity of
such a resolution and the multiplication of hardware it
would cause. We have, therefore, grouped the processing of
events and semaphores.

The form of the descriptors manipulated is as follows:

NAME | VALUE | TYPE | FIRST | LAST

NAME        : pointer towards the semaphore or event
              identifier
VALUE       : value of the variable at a given instant
TYPE        : event/semaphore
FIRST, LAST : process queue reference


4.3.4.  Interruption  management 
The interrupts (IT) are materialized by a descriptor with
the form:

INTERRUPT | STATE | ATTACH PROCESS ADDRESS

The IT descriptors are implanted at addresses equal to
their level (IT No. i -> descriptor at address i). When an
IT is enabled, the IT processor inserts the attached
process at the head of the queue. The scheduler takes
control and, if necessary, activates it.

This processing concerns ITs directly connected with a
task.
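
The interrupt path just described can be sketched as follows. The state names ("enabled", "masked"), the process names, and the use of a Python deque for the ready queue are assumptions made for the illustration.

```python
from collections import deque

ready_queue = deque()   # the scheduler's queue of ready processes

# IT descriptors, indexed by interrupt number as in the text
# (IT No. i -> descriptor at address i). Contents are hypothetical.
it_descriptors = {
    0: {"state": "enabled", "process": "clock_task"},
    1: {"state": "masked", "process": "io_task"},
}


def raise_interrupt(level):
    """IT processor: queue the attached process at the head if the IT is enabled."""
    descriptor = it_descriptors[level]
    if descriptor["state"] != "enabled":
        return False
    ready_queue.appendleft(descriptor["process"])
    return True


def schedule():
    """Scheduler: take the process at the head of the ready queue, if any."""
    return ready_queue.popleft() if ready_queue else None
```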


5.  INTERRUPTIBILITY 

The definition of interruptibility at a "logic" level,
that is, at the level of the intermediate language and the
macro-interpreter, is very delicate, or even impossible,
given the contextual interpretation mode we have chosen.
An "instruction", or execution unit, at this level is, in
effect, something of variable length, and may even be the
program itself.

The concept of point of interruptibility must, therefore,
be more closely defined, even if the macro-interpreter
level has the advantage of reducing context volume to a
minimum when the interrupt is enabled.

The division of an N-tuple by the Macro-Interpreter into
ACTIONS permits the interruptible points to be fixed at
the beginning of each ACTION. This choice establishes a
compromise between the volume of information to be saved
and the time needed to save it. In effect:

- Taking interrupts as fast as possible means switching a
  larger amount of data, and the time this takes may
  cancel the benefit of the policy.

- A takeover deferred until certain key moments in the
  execution of a program entails the manipulation of a
  smaller amount of data and may, therefore, be more
  efficient than immediate processing.

Moreover, at the beginning of an ACTION, the MAI context
is at a minimum. However, to justify this choice, the
execution time of an action must remain compatible with
the requirements of interrupt processing.
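
The deferred takeover can be illustrated with a small sketch in which pending interrupts are examined only at the beginning of each ACTION. The function and variable names are hypothetical, and actions are modelled as plain Python callables.

```python
def run_actions(actions, pending, handle):
    """Run actions in order, taking interrupts only between actions."""
    log = []
    for action in actions:
        # Interruptible point: beginning of an ACTION, where context is minimal.
        while pending:
            log.append(handle(pending.pop(0)))
        log.append(action())
    return log


pending = []


def action_1():
    pending.append("IT 3")      # an interrupt arrives while action 1 runs
    return "action 1 done"


def action_2():
    return "action 2 done"


log = run_actions([action_1, action_2], pending, lambda it: "handled " + it)
```

The interrupt raised during the first action is serviced only before the second one starts, so its latency is bounded by the execution time of one action, which is the constraint stated above.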

6.  CONTEXT 

Given  the  machine  structure  we  have  described, 
this  context  will  be  larger  than  that  found  on  a 
conventional  machine.  It  is,  moreover,  spread  over 
several  functional  units  and,  thus,  may  be  divided 
into  three  parts  : 

-  task  characterisation  context 

-  macro-interpreter  context 

-  micro-interpreter  context 

6.1. Task characterisation

This is the part of the context which is closest to the
information found on a classic machine. It defines both
the identity of the process and its work space for
anything concerning the data manipulated.

Definition of process identity includes the following
information:

NAME   : pointer to the name of the process
ADCODE : process start address
NIT    : number of the interrupt tied to the process

This information will be contained in a specific location
in the descriptor memory.

The definition of process work space includes the
following information:

ADDESC     : description space base
STACK      : number of the execution stacks in the
             Macro-Interpreter
BASE 1 (G) : process global data base
BASE 2 (L) : process local data base
BASE 3 (2) : local data base for running procedure
BASE 4     : address of page table for the process.

The type of topographic implantation chosen (see above)
calls for the constitution of correspondence tables
(virtual pages -> real pages) specific to each process.
During execution of a process this table is loaded in a
specialized memory, and it must also exist in memory so
that it can be reestablished after an interruption
followed by context switching.

6.2. Macro-Interpreter context

The execution of a process brings about an evolution of
the information contained in the macro-interpreter,
characterizing the logical evolution of interpretation.

This information, too, may be divided into three parts:

- Program context
  . IC : instruction counter of the program in IML
  . OPT : address of end of block under examination
  . STACK 11 and TOP 2 : address stack for the end of the
    included block and its pointer

- Interpretation context
  . AR : address register on interpretation program
  . STACK 2 : return address stack at the end of the
    decoding submicroprogram

- State of communication with the micromachine
  . Generated actions queue and its pointers
  . Queue of parameters to be transmitted and its
    pointers.

6.3. Micromachine context

The volume of significant context in the micromachine has
been reduced considerably by the fact that the interrupts
are enabled between two actions, as we have said above.

The information to be saved consists of the five registers
making up the external register of the AMD 2903. These
registers are used to transmit the parameters between the
various actions. It is to be noted that as this extension
has direct access to the descriptor memory, its content is
saved in a single memory cycle.

This information will, therefore, be saved in the space
descriptor of the interrupted process.

CONCLUSION 

The  high  level  computer  architectures  previous¬ 
ly  studied  or  realized  concerned  monotask  langua¬ 
ges.  This  study  shows  the  principal  problems  met 
in  the  implementation  of  a  multi-task  real  time 
language . 

Interpretation  processing  has  been  divided 
into  three  classes  : 

- organic processing associated with the management of a
  multi-task system

- formal processing associated with the control of one
  task

- effective processing associated with the execution of
  each instruction of one procedure.

The hardware structure has been designed to support these
three kinds of processing efficiently.

The realization of a prototype able to support the LTR
language should allow the validation of these concepts.


Abstract. We introduce an architecture which performs
many of the optimizations commonly seen in sophisticated
compilers for high-level languages, including redundant
expression elimination and the movement of invariant
expressions out of loops. The instruction set of this
machine allows simple compilers to produce a graph-
structured object code which is both compact and
efficient. The architecture features a cache which records
the values and dependencies of HLL expressions in order
to avoid later recomputations and memory references.
Preliminary experimental results indicate a speedup
approaching a factor of two over a pure stack architecture
on some programs.


I.  Introduction 

The arguments in favor of closing the "semantic gap"
between source program and object program are well known
to participants of this conference. Myers [1] characterizes
the job of the computer architect as determining the proper
division of total system functionality between software,
firmware, and hardware. Two extremes of this division are
possible. At one extreme we have traditional architectures
which tend to leave too much to the software and are
ill-suited to the software they execute. Complex operating
systems are necessary to make them useful; complex
compilers are necessary to make high-level languages (HLLs)
execute efficiently. At the other extreme we have
architectures which attempt to execute high-level languages
directly. These architectures are often inefficient
themselves; program representations appropriate for
programmers are not always appropriate for computers. It is
likely that better cost-performance can be achieved by an
architecture which falls somewhere between these extremes.
Our architecture is one of many such; it is aimed at
reducing or eliminating the need for (and hence the costs
of) optimizing compilers by performing important
optimizations in hardware. It does not directly address
other dimensions of the problem, such as the complexity of
operating systems.

The total cost of optimizing compilers is great. Their
construction is a formidable software engineering task. The
code they produce is almost always obscure, occasionally
worse than no optimization, and sometimes just plain wrong.
They also execute more slowly, and hence exact a price on
each compilation. Research is underway in several places
aimed at reducing this cost through the automatic, or
semi-automatic, generation of such compilers [2]. Our
approach to this problem is different; we are trying to
raise the hardware/software interface above the level of
the compiler's optimization phase, thus reducing the
compiler's task to (mainly) lexical analysis and parsing.
Efficient algorithms for these phases are known, and the
automatic construction of such compilers would be within
our grasp.

Our architecture is able to perform two common and
important optimizations: redundant expression elimination
and a type of code motion typified by the movement of
invariant expressions out of loops. These optimizations
traditionally require sophisticated flow analysis during
compilation, so their elimination from compilers should be
beneficial. Our research is aimed at determining how big an
impact this architecture can have on the total
cost-performance of a compiler-architecture pair.

In this paper we will introduce the architecture and argue
its advantages informally and by example. Other work is
under way to determine the architecture's quantitative
benefits over a range of real programs. Because we are
interested in basic feasibility, we defer the specification
of many details which would be necessary before the
architecture could be realized. In particular, we are not
specifying how to implement the architecture, nor are we
specifying the instruction set beyond what we absolutely
need. So as not to be overly distracted by language issues,
we have chosen FORTRAN as our high-level language. We
believe that the necessary extensions for other languages
would be no more difficult on our architecture than on
others, and therefore they are irrelevant to the current
goals of the research.

2.  Basic  Concepts 

To  briefly  outline  the  thrust  of  the  architecture,  consider 
the  FORTRAN  statement 

X  =  (A  +  B)*C  +  (A  +  B) 


which has this parse tree:

        =
       / \
      X   +
         / \
        /   \
       *     +
      / \   / \
     +   C A   B
    / \
   A   B

Suppose we had an instruction set which closely mimicked
this parse tree representation, one instruction per node.
Each instruction might be a triple

      (OPCODE, LEFT PART, RIGHT PART)

where left part and right part would be addresses of
instructions which calculate the operands. The execution of
an instruction would consist of recursively evaluating the
left and right parts of the instruction, followed by the
application of the indicated operation. This architecture
could be implemented using two stacks: one to hold
intermediate computations and one to hold
partially-evaluated instructions during the post-order
traversal of the parse tree. The order of instructions in
memory would be irrelevant in this instruction set; the
control flow is specified explicitly. The translation of
the above statement would be

      =, X, P1
P1:   +, P2, P3
P2:   *, P4, C
P3:   +, A, B
P4:   +, A, B

This instruction set is obviously very inefficient, but it
can illustrate two points. First, because the instructions
labeled P3 and P4 are identical, there is no reason to
duplicate them; we can eliminate P4 and change P2 to

P2:   *, P3, C

The subexpressions giving rise to P3 and P4 are called, in
the parlance of compilers, formally identical or congruent.
This simply means that they are identical in form, not
necessarily that they have the same value. It is both
simple and efficient to detect formal identity during
parsing, and doing so at compile time allows us to
represent programs more space-efficiently in our
architecture. By contrast, detecting common subexpressions,
i.e., formally identical expressions that also are
guaranteed to have the same value at execution time, is not
as simple or efficient. Our architecture will not require
the compiler to do this.
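
One common way to detect formal identity while parsing is to keep a table of the triples emitted so far and reuse a label whenever the same triple appears again. The sketch below shows that technique as an assumption about how a simple compiler might do it; the paper does not describe its compiler, the function names are hypothetical, and the labels are numbered in order of emission, so they differ from the P1..P4 labels used in the text.

```python
def intern_node(table, code, op, left, right):
    """Return the label of the triple (op, left, right), emitting it only once."""
    key = (op, left, right)
    if key not in table:
        label = f"P{len(code) + 1}"
        table[key] = label
        code[label] = key
    return table[key]


def compile_expr(expr, table, code):
    """expr is a variable name (str) or a nested tuple (op, left, right)."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    return intern_node(table, code, op,
                       compile_expr(left, table, code),
                       compile_expr(right, table, code))


# (A + B)*C + (A + B)
table, code = {}, {}
a_plus_b = ("+", "A", "B")
root = compile_expr(("+", ("*", a_plus_b, "C"), a_plus_b), table, code)
```

Only three instructions are emitted for the four operator nodes of the parse tree, because both occurrences of A + B map to the same label.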

Notice that even though the expression "A + B" is
represented only once in the object program (using the
aforementioned compaction), it is actually evaluated twice
in the implied traversal of the parse tree. The structure
of the object code gives us the possibility of avoiding
this recomputation. Suppose that after completing the
evaluation of P3 (while computing the left part of P1) we
saved the "value" of this instruction in a cache, labeled
by the address P3. If we checked that cache before
evaluating each instruction operand, we could retrieve the
value of P3 when computing the right part of P1 without
actually recomputing it. Suitable care would have to be
taken to record dependency information in the cache so that
we could remove the value, should either A or B change in
the future.

Our architecture provides such a cache, which is the major
source of execution-time efficiency. The effect of using
this cache corresponds closely to the elimination of
redundant expressions by optimizing compilers. In fact,
this technique may be superior, because it can eliminate
expressions which are redundant under the particular
execution history of the program. Consider, for instance,
the following FORTRAN statements:

      Y = A+B
      IF (Y .LT. 0) A = A+1
      X = A+B

Because the two occurrences of "A + B" are formally
identical, they can be computed by a single instruction
which is referenced in two assignment statements. It can be
seen that the value of the expression A + B, computed in
the first statement, can remain in the cache unless the
assignment to A actually takes place (invalidating A + B).
The same mechanism serves to move invariant expressions out
of loops, since any expression which does not depend on a
value changed in the loop will remain in the cache.

This simple example illustrates our architectural goal: to
provide an instruction set which preserves the structure of
the parse tree in a way that permits both space-efficient
representation (by having only one copy of the code for
formally identical expressions) and time-efficient
execution (by detecting and avoiding the re-evaluation of
expressions whose value has not changed).
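
The behaviour of such a cache on the three FORTRAN statements above can be imitated in software. The following sketch is an assumption-laden model, not the hardware design: the class `Machine`, its fields, and the policy of storing a set of variable names with each cached value are all hypothetical.

```python
class Machine:
    def __init__(self, code, memory):
        self.code = code        # phrase label -> (op, left, right)
        self.memory = memory    # variable name -> value
        self.cache = {}         # phrase label -> (value, variables it depends on)
        self.evaluations = 0    # number of phrases actually computed

    def eval(self, operand):
        """Return (value, dependencies) of a variable or a phrase label."""
        if operand in self.memory:
            return self.memory[operand], {operand}
        if operand in self.cache:
            return self.cache[operand]
        op, left, right = self.code[operand]
        left_value, left_deps = self.eval(left)
        right_value, right_deps = self.eval(right)
        value = left_value + right_value if op == "+" else left_value * right_value
        self.evaluations += 1
        self.cache[operand] = (value, left_deps | right_deps)
        return self.cache[operand]

    def store(self, name, value):
        """Assign a variable and invalidate every cached value that depends on it."""
        self.memory[name] = value
        self.cache = {label: entry for label, entry in self.cache.items()
                      if name not in entry[1]}


def run(a, b):
    """Y = A+B ; IF (Y .LT. 0) A = A+1 ; X = A+B, with one shared phrase P."""
    m = Machine({"P": ("+", "A", "B")}, {"A": a, "B": b, "X": 0, "Y": 0})
    m.store("Y", m.eval("P")[0])
    if m.memory["Y"] < 0:
        m.store("A", m.memory["A"] + 1)
    m.store("X", m.eval("P")[0])
    return m.memory["X"], m.evaluations
```

When the assignment to A does not occur, the second reference to P is served from the cache and the phrase is computed once; when it does occur, the entry is invalidated and P is computed again.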

3. The Architecture

We now introduce the architecture and instruction set
currently being used in our research. We would like to
emphasize that this version of the architecture is a
research vehicle, one intended (only) to test the
feasibility of the ideas and their impact on performance. A
realistic implementation would need to address other issues
and would require careful tuning and elaboration of the
instruction set.


There are four important parts of the machine, as indicated
in Figure 1:

Memory            A linear vector of fixed-size words,
                  indexed by address.

Evaluation Stack  A LIFO stack of words, used to hold
                  intermediate values during computation,
                  much the same as in other stack-oriented
                  machines.

Control Stack     A LIFO stack of control information, used
                  to control the recursive descent through
                  the parse tree graph.

Value Cache       An associative memory used to save the
                  values of expressions.

The Control Stack and the Value Cache will be explained in
more detail later.


[Figure 2: Memory word format (Tag | Value)]

Every word in memory is a one-operand instruction,
formatted as a (tag, value) pair (Figure 2). Even words
usually thought of as data are, in this machine,
instructions. The TAG field is further divided into a
number of subfields, named R, X, I, and OP. OP is the
operation code (e.g. ADD), and R, X, and I are single-bit
fields denoting Return, Index, and Indirect. (These will be
described later.) The actual bitwise packing of these
fields into a word is not too important, but for
concreteness, we think of TAG as being 8 bits and VALUE as
being (say) 24 bits. This would give us a 5-bit operation
code and leave 24 bits for data or an address.
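
For concreteness, one possible packing of such a word is sketched below. The text fixes only the field widths (three single-bit flags, a 5-bit operation code, a 24-bit value); the bit order within the tag and the opcode numbering used here are assumptions.

```python
# Hypothetical opcode numbering; the paper does not assign codes.
OPCODES = {"INT": 0, "REAL": 1, "ADDR": 2, "PUSH": 3, "ADD": 4}
NAMES = {number: name for name, number in OPCODES.items()}


def pack(op, value, r=0, x=0, i=0):
    """Build a 32-bit word: 8-bit tag (R, X, I, 5-bit OP) and 24-bit value."""
    if not 0 <= value < (1 << 24):
        raise ValueError("value does not fit in 24 bits")
    tag = (r << 7) | (x << 6) | (i << 5) | OPCODES[op]
    return (tag << 24) | value


def unpack(word):
    """Split a 32-bit word back into its fields."""
    tag, value = word >> 24, word & 0xFFFFFF
    return {
        "op": NAMES[tag & 0x1F],
        "r": (tag >> 7) & 1,
        "x": (tag >> 6) & 1,
        "i": (tag >> 5) & 1,
        "value": value,
    }
```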


3.1 Instruction Classes

The instructions are divided into three classes according
to how their operands are interpreted. The three classes
are data instructions, address-operand instructions, and
value-operand instructions.


Data instructions. The INT, REAL, and ADDR instructions
correspond to the three data types recognized by this
simple version of the architecture. Executing any of these
instructions causes them to push themselves (value and tag)
onto the Evaluation Stack, setting R=1 and X=I=0. The
contents of the VALUE field in data instructions is the
actual data (i.e., in INT instructions, VALUE is the
integer datum, in REAL it is the floating-point
representation, and in ADDR instructions it is an address).

The data instructions are quite like "tagged" data in other
HLL architectures. In particular, we will assume automatic
type conversion throughout; there will not be separate
instructions for floating-point addition and integer
addition, for instance.

If X is a variable of type REAL with value 43.5, the name X
will be bound to the address of a word containing (the
instruction)

      REAL 43.5

The reason we make data words executable will become clear
when the operand-fetching mechanism is examined later.

Address-operand instructions. These instructions include
INC1 (increment-by-one), INC (general increment), STO
(store), and the twelve conditional-jump instructions. In
each case, the VALUE field is interpreted as an address,
and this address is the instruction operand. The semantics
of the instructions are as follows:

STO   Removes the top word from the Evaluation Stack and
      stores it at the operand address. The R field is set
      to 1 in the stored word, and the X and I fields are
      set to 0.

INC   Removes the top word from the Evaluation Stack and
      adds it to the word at the operand address.

INC1  Increments the value of the word at the operand
      address by one.

JLT, JLE, JGT, JGE, JEQ, JNE
      Remove the top value from the Evaluation Stack and
      branch to the operand address if the value is less
      than, less than or equal to, greater than, greater
      than or equal to, equal to, or not equal to zero,
      respectively.

Value-operand instructions. These instructions include PUSH
and the arithmetic instructions ADD, SUB, MUL, and DIV. For
these instructions, the VALUE field is again interpreted as
an address, but the operand is obtained by evaluating the
address, as explained below. Otherwise the semantics of the
instructions are as follows:

PUSH  pushes its operand onto the Evaluation Stack.

NEG   negates its operand before pushing it.

ADD   removes the top word from the Evaluation Stack and
      adds it to its operand, leaving the sum on the
      Evaluation Stack. Type conversions are performed, if
      necessary, according to standard FORTRAN conventions.
      (Type information is available in the tag fields of
      the data on the Evaluation Stack.)

SUB, MUL, DIV
      work like ADD, with the left-hand argument being on
      the stack and the right-hand argument being the
      operand of the instruction.

Occasionally, one will want an instruction such as ADD to
take both its operands from the stack. We therefore adopt
the convention that if VALUE = 0, the operand normally
specified in the instruction will be found as the topmost
element on the Evaluation Stack. This applies to both
address-operand and value-operand instructions.

3.2 Operand Evaluation

As stated above, value-operand instructions obtain their
operands by evaluating the address which appears in the
instruction. In this architecture, the evaluation mechanism
uniformly replaces the "fetch-the-contents-of" mechanism in
traditional architectures. To evaluate an address A, the
current instruction-execution state is saved on the Control
Stack and execution begins at A. After each instruction
completes, the R bit is examined; if R = 1, the Control
Stack is popped, terminating the new instruction sequence
and returning to the previous one at the point where it was
interrupted. In our examples, we will indicate that an
instruction has R = 1 by appending "r" to the operation
name.


Strictly speaking, there is no restriction on what
instructions can occur in the new instruction sequence.
However, it is our intent that the sequence of
instructions, which is called a phrase, will leave a single
value on the Evaluation Stack. If we make the further
assumption that the computation is independent of data
already on the Evaluation Stack, it is possible to speak of
the value of A, or the value of the phrase A.

Note that a single data instruction, with R = 1, satisfies
these conditions for a phrase. Hence, a single data word
may be "fetched" by evaluating (executing) it.

3.3  Indexing 

The X field is provided in tags to perform some simple
address arithmetic. When X = 1, the address in the
instruction is first incremented by the value found on top
of the Evaluation Stack (which is removed as a side
effect). The new address becomes the operand (for
address-operand instructions) or the address to be
evaluated to obtain the operand (for value-operand
instructions). In our examples, we will indicate that X = 1
in an instruction by appending "x" to the instruction name,
as in "STOx A".

Occasionally it will be useful to obtain an indexed address
on the stack without evaluating the result. We therefore
allow the X field to be set in the ADDR instruction, in
which case the address present in the VALUE field of the
ADDR instruction is incremented by the value on top of the
Evaluation Stack, and the resulting address is pushed onto
the stack.

3.4  Indirection 

The I field is used to provide an extra level of evaluation in
obtaining operands. When I = 1, the operand obtained by the
above mechanisms is evaluated an extra time to obtain the
true operand. For instance, in "STOi A", the address A is
evaluated, and the actual store occurs to the address returned
by the phrase A. In "ADDi A", the address A is first evaluated
normally; then the resulting value of A is evaluated, yielding the
operand.

This mechanism makes several assumptions. In particular,
in value-operand instructions it is assumed that the value
returned by the first evaluation is an address (so that it can be
evaluated again). Likewise, in address-operand instructions it
is assumed that the evaluation (the one caused by I = 1 is the
only one) produces a value of address type.

When I = X = 1, the indexing operation is applied before the
(first) evaluation.
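The rules of Sections 3.3 and 3.4 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name resolve_operand, the flat dictionary standing in for memory, and the use of a plain lookup as "evaluation" are all hypothetical and not part of the architecture's definition.

```python
def resolve_operand(address, x, i, stack, evaluate, value_operand=True):
    """Return the operand of one instruction under the X and I fields.

    `evaluate` stands in for phrase evaluation (an assumption of this sketch).
    """
    if x:
        # Indexing comes first: add the top of the Evaluation Stack,
        # which is removed as a side effect.
        address += stack.pop()
    if value_operand:
        operand = evaluate(address)      # normal evaluation of the address
        if i:
            operand = evaluate(operand)  # extra level: result must be an address
    else:
        operand = address                # address-operand: the address itself
        if i:
            operand = evaluate(address)  # the only evaluation; yields an address
    return operand


# Hypothetical memory: each address holds a single data word.
memory = {100: 7, 101: 200, 200: 42}
fetch = memory.__getitem__

plain = resolve_operand(100, False, False, [], fetch)        # value at 100
indexed = resolve_operand(100, True, False, [1], fetch)      # value at 100 + 1
indirect = resolve_operand(101, False, True, [], fetch)      # value at (value at 101)
store_target = resolve_operand(100, True, True, [1], fetch, value_operand=False)
```

With both fields set on an address-operand instruction, the index is applied first and the single evaluation then supplies the target address, which matches the ordering stated above.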

3.5  Discussion 

Returning to our original example, we can see what the
code actually looks like in this architecture.

    X = (A + B)*C + (A + B)

            PUSH    P3
            MUL     C
            ADD     P3
            STO     X

    P3:     PUSH    A
            ADDr    B
    A:      REALr   23.6
    B:      REALr   -3.0
    C:      REALr   4.66E1
    X:      REALr   0.0

Note how the evaluation mechanism is exploited in
collecting the formally identical expressions into a single
phrase (P3).

The indexing and indirection mechanisms are optimizations
designed to facilitate address computations in array and
structure accesses, much like the use of index registers in
conventional architectures. In (a), below, we see the simplest
form of indexing; in (b) the two occurrences of "C(I)" have
been implemented as a single phrase; in (c) the phrase has
been constructed to compute the address of C(I) since both
the address and value are needed.

    C(I) = A(J)          X = (C(I) + B)*C(I)      C(I) = C(I) + B

        PUSH   J             PUSH    L                PUSHi   L
        PUSHx  A-1           ADD     B                ADD     B
        PUSH   I             MUL     L                STOi    L
        STOx   C-1           STO     X
                             ...                      ...
                         L:  PUSH    I            L:  PUSH    I
                             PUSHxr  C-1              ADDRxr  C-1

        (a)                  (b)                      (c)

These examples indicate that there is some choice in how
to structure the object code. In terms of space-efficiency, any
expression appearing in the source program more than once
should be expanded as a separate phrase. Execution-time
efficiency can be gained by additionally separating expressions
used within a loop; if their value does not change, the effect is
the same as if the compiler had moved them outside the loop.

3.6 The Value Cache

The Value Cache is the most distinctive and important part of
the architecture. Its purpose is to save the value of phrases.
Every time an evaluation is attempted, the Value Cache is first
checked to see if it contains the phrase's value; if found, the
value can be immediately entered on the Evaluation Stack
without any need to actually execute the phrase in question. If
the Value Cache does not contain the desired value, evaluation
proceeds normally and the new value is copied into the Value
Cache as a side-effect of the processing of the R field in the
last instruction of the phrase.

An important part of the caching mechanism is keeping
track of dependency information. The value of a phrase can
depend on an unbounded set of memory locations, namely all
those which are referenced in the course of its evaluation.
Should any of these locations be changed, the old value in the
Value Cache must be purged.

Because the space available to represent dependency
information in the cache will be limited, we must have a way to
encode the dependency information. A possible
implementation is to represent the dependency set as a bit
vector of length n. A dependency on a particular memory
word with address A could then be mapped into one of the n
bits by an operation on the word's address, D(A). An inclusive
"OR" of all encoded addresses would then represent the
dependencies of the phrase. Purging from the cache all
values dependent on address B could be accomplished by
eliminating all entries which included bit D(B) in their
dependency mask.
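The bit-vector encoding can be illustrated with a short Python sketch. The mapping D(A) = A mod n with n = 32 is the one reported for the emulation in Section 4; the function names and the dictionary used as a cache are hypothetical choices made for this illustration.

```python
N_BITS = 32  # length n of the dependency bit vector (value used in Section 4)


def encode(address):
    """D(A): map an address onto one of the n bits; here D(A) = A mod n."""
    return 1 << (address % N_BITS)


def dependency_mask(addresses):
    """Inclusive OR of the encoded addresses referenced by a phrase."""
    mask = 0
    for address in addresses:
        mask |= encode(address)
    return mask


def purge(cache, changed_address):
    """Drop every entry whose mask includes bit D(changed_address)."""
    bit = encode(changed_address)
    return {phrase: (value, mask)
            for phrase, (value, mask) in cache.items()
            if not mask & bit}


# Hypothetical cache: phrase name -> (value, dependency mask).
cache = {
    "P": (10, dependency_mask([3, 40])),
    "Q": (20, dependency_mask([5])),
}
after_write_to_8 = purge(cache, 8)   # 40 mod 32 == 8, so P is purged as well
after_write_to_4 = purge(cache, 4)   # no entry depends on bit 4
```

The first purge shows the price of a compact encoding: P never referenced address 8, but address 40 maps to the same bit, so P is discarded anyway. The result is still correct, only less efficient.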

To explain how the Value Cache is used, we need some
information about both the Value Cache and the Control Stack.
The Value Cache is an associative memory, each entry of
which has three fields:

    VC-ADDRESS       address of phrase
    VC-VALUE         value of phrase
    VC-DEPENDENCY    dependency of phrase

Control Stack entries also have three fields:

    P-ADDRESS        address of phrase
    ISTATE           current execution state
    CS-DEPENDENCY    accumulating dependency

There are four activities which involve the evaluation
mechanism and the Value Cache:

Beginning an evaluation. The Value Cache is checked to see
if it contains the phrase's value; if so, the value is immediately
entered onto the Evaluation Stack and the evaluation is
considered complete; dependency information from the Value
Cache (VC-DEPENDENCY) is added to the dependencies being
accumulated for the current phrase (CS-DEPENDENCY). If the
phrase is not found, the current execution state is saved on
the Control Stack and a new frame is added for the new
phrase, whose evaluation begins. CS-DEPENDENCY for the new
phrase is initially null.

During evaluation. Every execution of a data instruction
represents a dependency; the dependency is derived from the
address of the data instruction. The encoded dependency is
added to the dependencies already recorded in CS-DEPENDENCY.

After evaluation. When an instruction with R = 1 is completed,
the phrase value (the top value on the Evaluation Stack),
P-ADDRESS, and CS-DEPENDENCY are sent to the Value Cache for
recording as VC-VALUE, VC-ADDRESS, and VC-DEPENDENCY,
respectively. (If the Value Cache is full, some mechanism for
removing entries must be employed.) The Control Stack is
then popped to return to the previous phrase; the
dependencies of the completed phrase are added to the
dependencies accumulating for the previous phrase. (That is,
if phrase A invokes phrase B, phrase A's dependencies include
those of phrase B.)

During a store operation. Whenever a STO, INC, or INCi
instruction is executed, every Value Cache entry which shows
a dependency on the altered word is purged. (This may not be
a perfect discrimination, depending on the encoding D(X).)
The value being stored (itself a phrase) is entered into the
Value Cache as a side-effect; its dependency is precisely itself.

As an example, consider the following (assume M(6) = 45):

    K = M(I) + I

            PUSH    L
            STO     K

    L:      PUSH    I
            PUSHx   M-1
            ADDr    I
    I:      INTr    6
    K:      INTr    45


There are four phrases entered in the Value Cache after
executing this statement:

    VC-ADDRESS    VC-VALUE    VC-DEPENDENCY
    I             INT 6       D(I)
    M + 5         INT 45      D(M + 5)
    L             INT 51      D(I) ∨ D(M + 5)
    K             INT 51      D(K)

If we later changed the value of I, the phrases I and L would
be purged from the Value Cache, but M(6) (i.e. M + 5) would
remain, unless by chance D(I) = D(M + 5).
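The four activities can be traced on this example with a small Python model. It is a sketch under stated assumptions, not a description of the actual emulator: the class Machine, the numeric addresses, and the tuple encoding of instructions are hypothetical; each phrase uses a private stack, which is equivalent to the shared Evaluation Stack only because phrases are assumed independent of data already on it; and only the PUSH, PUSHx, ADD and STO behaviour needed here is modelled.

```python
N_BITS = 32  # dependency field width, as in the emulation of Section 4


def D(addr):
    """Encode an address as a one-bit dependency mask: D(A) = A mod n."""
    return 1 << (addr % N_BITS)


class Machine:
    def __init__(self, memory):
        self.memory = memory  # address -> phrase, a list of (opcode, argument)
        self.cache = {}       # Value Cache: phrase address -> (value, dependency)

    def evaluate(self, addr):
        """Return (value, dependency mask) for the phrase starting at addr."""
        if addr in self.cache:             # hit: the phrase is not executed
            return self.cache[addr]
        stack, deps = [], 0                # new frame, dependency initially null
        for op, arg in self.memory[addr]:
            if op == "DATA":               # a data instruction depends on itself
                stack.append(arg)
                deps |= D(addr)
                continue
            if op.endswith("x"):           # X = 1: index by the top of the stack
                arg += stack.pop()
                op = op[:-1]
            value, sub_deps = self.evaluate(arg)
            deps |= sub_deps               # callee dependencies join the caller's
            if op == "PUSH":
                stack.append(value)
            elif op == "ADD":
                stack.append(stack.pop() + value)
            else:
                raise ValueError(f"unsupported opcode {op}")
        self.cache[addr] = (stack[-1], deps)   # recorded when the phrase ends
        return self.cache[addr]

    def store(self, addr, value):
        """STO: purge dependents of addr, then cache the stored value."""
        bit = D(addr)
        self.cache = {a: e for a, e in self.cache.items() if not e[1] & bit}
        self.memory[addr] = [("DATA", value)]
        self.cache[addr] = (value, bit)


# Hypothetical addresses; M is the address of M(1), so M(6) lives at M + 5.
I, K, L, M = 10, 11, 50, 100
machine = Machine({
    I: [("DATA", 6)],
    K: [("DATA", 45)],
    M + 5: [("DATA", 45)],
    L: [("PUSH", I), ("PUSHx", M - 1), ("ADD", I)],
})

value, _ = machine.evaluate(L)   # PUSH L
machine.store(K, value)          # STO K
after_statement = dict(machine.cache)

machine.store(I, 7)              # a later change of I purges I and L
after_change = dict(machine.cache)
```

After the statement the model holds entries for I, M + 5, L and K, with L and K both carrying the value 51. Storing a new value into I removes the old I and L entries, leaves M + 5 and K in place, and re-enters I with its new value as the side-effect of the store.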


4.  Measurements 

To obtain objective measures of the performance of this
architecture, we present here analyses of four simple programs:
three production-quality statistical subroutines taken from the
Scientific Subroutine Package and one simple quadratic-equation
solver taken from an introductory programming text.
When we say production-quality, we mean that there is no
obvious way to rewrite the source program more efficiently in
the statistical subroutines. In contrast to this, the
quadratic-equation program contains several examples of formally
identical (and redundant) expressions.

We examined the execution of these programs on three
compiler/architecture pairs: on our architecture with a simple
compiler performing no optimizations; on a DEC PDP-10 with
the FORTRAN-10 optimizing compiler; and on a modified stack
architecture (MSA). The MSA is a variant of our architecture,
obtained by eliminating the evaluation mechanism (including
Value Cache and Control Stack) in favor of the simple
"fetch-the-contents-of" mechanism; it is thus a simple stack
architecture with the same one-operand instructions as in our
architecture. The compiler for this architecture is identical to
the one for our principal architecture.

Code size statistics were obtained from listings of the
compiled assembly code. Execution statistics were obtained
from instruction traces on the PDP-10 and from emulators of
the other architectures. In emulating our architecture, we used
a Value Cache with 100 entries and a 32-bit-wide dependency
field with D(A) = A mod 32.

In comparing program sizes, we assume that a "word" is
equivalent on the different architectures. Likewise, execution
statistics are expressed as the number of memory fetches and
stores (instructions plus data). We do not count internal
processing, so all instructions take unit time unless they
involve a fetch or store from memory. (We do not consider the
Value Cache to be memory in this sense.) With this in mind,
we present the data in Tables 1 and 2. Tables 3 and 4 present
the same data as a fraction of the MSA values.


                        Architecture
    Program     PDP-10      Ours       MSA
    S1             186       211       224
    S2             148       188       186
    S3              80        94        98
    S4             121       118       189

    Table 1: Code size (words)

                        Architecture
    Program     PDP-10      Ours       MSA
    S1           2,182     2,414     3,847
    S2           1,282     1,726     2,219
    S3           8,618     9,886    12,942
    S4             408       447       824

    Table 2: Execution speed (fetches)

    Program     PDP-10      Ours
    S1             .69       .88
    S2             .68       .78
    S3             .60       .76
    S4             .60       .64

    Table 4: Execution speed (fraction of MSA)

The PDP-10 and MSA are in a sense upper and lower
bounds for comparison purposes. The PDP-10 is a mature
instruction set in the traditional von Neumann mold; it has
been carefully designed and optimized. MSA on the other
hand is the simplest stack machine one can imagine. Likewise
the PDP-10 incorporates a sophisticated compiler, whereas the
other architectures have very simple compilers. (In particular,
they do not even have to do register allocation.)

The data confirms that the PDP-10 is still the more highly
optimized architecture, but in the case of the S4 program, our
simple compiler was able to produce code which was more
compact and which executed almost as quickly. Clearly the
benefits depend to some extent on the degree to which
redundant expressions can be eliminated during compilation
and execution. However, even with well-coded programs, we
see a significant improvement over a simple stack architecture.

Of course, these few examples cannot alone establish the
benefits of our architecture. They are meant only as an informal
argument to establish the possibility of such benefits, even in
programs not easily optimized. We hope to provide more
quantitative evidence on a wider range of programs in the
future, along with more information on the effect of the size of
the Value Cache and on cache-full policies [3].


Abstract

A mechanism for supporting fine-grain
program protection and abstraction in a multi-computer
context is described. It is argued that
such features are necessary to support high level
user interfaces and particularly high level language
implementations using microprogram control, and
that their cost must be small in relation to
microinstructions. The mechanism is currently
being investigated by simulation techniques as
part of a general-purpose system study.

Objectives 

The most important objective of general-purpose
computer design is to model accurately,
reliably and efficiently the data of widely varying
problem domains. We might instance records,
messages, tax tables or graphical images as typical
classes of data familiar to computer users, and
to the extent that the attributes of a class,
neither more nor less, are recognised we can say
that a successful abstraction has been achieved.
We define a 'high level' architecture as one that
supports such abstractions for an open-ended list
of classes. Its importance is that it enables complex
data processing applications to be developed
and maintained in a reliable state by offering to
information engineers something comparable with
the subassemblies and precise tolerances of, say,
mechanical design. Overall, one expects as a
result to produce better systems more quickly and
more reliably and at a lower cost than would otherwise
be possible.

The complexities of operating systems
have drawn attention to the importance of program
structure, most designers making use of the ideas
of task (i.e. process), file, segment, event and
others in abstract form. We could include code
segment in the list and thus lead to the accurate,
reliable and efficient modelling of high level languages,
but it would be a mistake in the present
context to put either operating system or language
engineers in positions of privilege since (through
no fault of their own) that seems to guarantee poor
response to user requirements. For example, in
range-defined architecture (in the style of the
IBM 360) the microprogrammer has in effect been a
language engineer with considerable privilege:
for precisely that reason it has been impractical
to make wide use of improvements in the encoding
of high level languages which depend on having
variable intermediate code formats. Attempts to
define architectures at even higher level run
correspondingly higher risks.

The order of events, therefore, is to
define the abstraction mechanism first and then
use it to model whatever operational behaviour is
required. But what is meant by doing that
'efficiently'? Fifteen years ago, under the
umbrella provided by the IBM 360, it seemed sufficient
to achieve the objective with 'no increase
in program size or loss of speed', which is essentially
what happened with the Basic Language
Machine¹. Today that umbrella is permeable and to
out-perform current range-defined architectures is
commonplace. The essential requirement now seems
to be to provide the benefits of abstraction at
the finest level of description used by system,
language or application engineers - in other words
at what is usually regarded as the microcode level.
Once that is done, the way is open to realising in
a practical context the advantages of microcoding
that have often been demonstrated under special
conditions.


In this paper I shall outline a design,
which for reasons soon to become clear is called
a "Pointer-Number system", which demonstrates one
way of meeting the objectives. It takes account
of system requirements not mentioned here, and has
been carried to a detailed simulation in order to
make realistic performance estimates. In the next
subsection we review the techniques on which it is
based and the range of problems that have to be
solved at the next stage of design. The following
subsections outline respectively the 'PN Machine'
and 'PN System'. Finally, some conclusions are
drawn from the experimental work done so far. The
reader is referred to the PN System Manual¹ for
more detailed explanation and justification.


Abstraction Mechanisms

The basic requirement is to mechanise the
ideas that might be expressed as: "Let A be a
class of objects with attributes {a_i}, i = 0 .. L",
"Let x be a (member of the class) A", "Let y
denote (the same member of the same class as) x",
and so on, all the representations being within the
limits of a finite computer store. In programming
terms this quickly resolves into the use of descriptors
or pointers as a type of operand distinct
from the attribute sets that represent the individual
objects, a construct that has been used from
the earliest days, though it was not precisely engineered
until segmented storage came into use in
the early 1960's (Figure 1). In the case of program
space the connection between (indexed) pointer
and attribute is notionally direct, but it is a
simple extension of the same idea to interpret the
descriptor as referring to a member of any given
class of objects, which was the generalisation made
in the Basic Language Machine (Figure 2). In the
latter case the pointer contains indices a, id that
uniquely identify the class and object in question.
In accordance with current practice we refer to
pointers used in this indirect way as "capabilities"
but the term "codeword" is retained for the special
case of reference to storage.

Figure 1: Storage segment (T: segment type; L: maximum index; F: location)

It is implicit that pointers cannot be
forged, otherwise the whole point of having precisely
engineered program structures is lost. On the
other hand they must be manufactured somewhere and
the class manager must be able to manipulate the
representations directly. Such considerations lead
to the notion of protected domains characterised by
sets of pointers that define the 'rights' of a program
at any instant of its execution. As control
flows from one domain to another there must be
corresponding changes in the list of rights.

Before discussing possible mechanisations
we should be aware of the performance parameters to
look for in the final analysis. Amongst the most
important is the time taken to access the attribute
given a valid pointer: there is no absolute figure
to aim for, but it is required to be short in relation
to the class of operations that it supports.
For example, in dealing with files or tasks the
individual operations are fairly substantial and a
number of capability systems have been implemented
in which pointers are interpreted by the operating
system without serious loss of speed. In moving
towards simpler operations the interpretive mechanism
must be refined and assisted, first by microprogram
and finally by hardware, and in the present
context the stringent requirement of having low
overhead in relation to micro-operations forces us
to disregard all but the most delicate controls.
In the model provided by Figure 1 we might nominate
the 'effective storage access time' as the relevant
parameter. In Figure 2 the critical time is that
taken to move the locus of control from the 'user
domain', containing the capability, to the 'class
manager domain' in which interpretation takes place
and back again. In either case, if the observed
cost is too high users will tend to avoid the
facility and lose its benefits.

Figure 2: Indirect class representation


The other factors are more difficult to
quantify because they entail the inevitable compromise
between cost of management and ease of use.
It might be asked: "If members of a class are
generated at a given rate, what is the resulting
management overhead?". For example, how often can
one open new files, create messages, or assign new
tasks without undue penalty? Clearly, some costs
are passed on to storage management which has to
provide file control blocks, buffers, task vectors
and so on, but there remains the responsibility
for master object tables, for recovering 'dead'
identifiers, and for error management. The techniques
available for reducing costs are mainly concerned
with the time taken to scan the program
space looking for particular classes of pointer
and might be aimed at eliminating that need
entirely, e.g. by:

(a) enlarging the master object tables to service
all foreseeable requests; or

(b) restricting the use of pointers, e.g. by
indirect reference through system tables or
by linguistic devices;

alternatively we can seek to minimise the actual
scanning time by:

(c) limiting the extent of pointer-bearing
segments; or

(d) constraining the program structure, e.g. to
separate task domains or to a 'tree' form.

In any well-designed capability system the constraints
are small in relation to the benefits they
bring, but the fact remains they are a psychological
hindrance to widespread acceptance. The
best way round that, architecturally speaking, is
by:

(e) providing high speed memory scanning and updating
operations, enabling many of the
restrictions to be relaxed.

The last solution is pursued in the PN system by
using what are effectively microprogrammed management
procedures in conjunction with hard-wired
'planar' memory scanning functions.

Returning to the primary measure of
storage access rate, it is clear that no scheme
dependent on validating pointers at time of use
(against access list, segment table, capability
registers, etc) would be acceptable, and in order
to compete with 'unrestricted' access mechanisms
we are forced (i) to admit pointers as operands
used directly by machine instructions; and (ii) to
control their formation so as to preserve the
integrity of programs. There still seems to be no
better way of doing that than by using a tagged
register format. However, in moving the control
mechanism to microinstruction level the interpretation
of tags must be resolved in single micro-orders.
In theory, just one tag bit is necessary,
to distinguish between pointers and numbers, but it
will be seen in the next subsection that fifteen
pointers and one form of number are distinguished
by a four-bit tag code.

We have already seen that because of its
practical importance storage is distinguished from
all other abstract classes. A further distinction
is drawn between sharable (global) and unsharable
(local) data areas. The corresponding pointers are
codewords and addresses respectively, which have
almost identical properties in normal use. It is
unfortunate to make the distinction, but it reflects
the fact that controlled access to shared
resources uses a single level of indirection which
is otherwise unnecessary. The same mechanism is
used to distinguish between data that might be at
a remote site in a multicomputer system (and therefore
'global') and data areas that are strictly
local.


Figure 3 illustrates the use of pointers
in referring to different program workspaces. The
transformation α is handled by capability managers,
while β is the responsibility of the segment manager.
Parallels can be drawn between writing in a
conventional high level programming language and
operating on global data, between microprogramming
and working at local level. However, a key feature
of the PN system is that sharp distinctions are not
drawn and it is easy to move from one level to the
next.

Figure 3: Levels of program space


A protection domain is defined by the combined
effect of two sets of rules: those that
govern the inheritance of access rights in registers
and storage, formation of new addresses from
old ones, restriction of access options in capabilities,
etc, all of which are reductive in character;
and those concerned with the expansion of
rights in passing from one domain to another. The
ability to expand rights depends on some prior
authority saying in advance that "program module M
shall only access resources m1, m2, ..., mk", which
in turn devolves on the construction of control
segments and associated data. Apart from the need
for speed and flexibility in implementing such a
rule we also require that it should be easy to
apply and not expensive to support. In the PN
system the region into which rights expand is defined
by a set of resources known as a base. There
will be several bases in a system, so there is
scope for partitioning at that level. The objects
m1, m2, ..., mk are identified by indices that are
embedded in object code. That seems to be the most
economical way of changing access lists, since it
is done at zero cost in conjunction with control
transfers. It will be shown later how the interconnection
mechanism is supported by machine functions
in the context of a dynamically changing
base, task and module population.


Pointer-Number  Machines 

In order to evaluate the above ideas in a
practical system context a detailed machine model
known as "microPN" has been defined and simulated.
The intention has been to provide full support for
abstraction in the context of an assembly of
processor-memory pairs, each comparable in cost
and speed with current microprogrammable machines.

The main components of microPN are shown
in Figure 4. The register file (X) consists of 16
32-bit general-purpose registers. Most internal
machine operations can be completed in one or two
ALU cycles, typically processing the 'high' halves
of the operands first, which includes checking
their tags, followed by the 'low' portions. The
ALU carries out elementary arithmetic, logic and
shift operations on numeric words, and the special
operations required in controlled pointer formation.

The sequence controller plays a conventional
role. The most frequently used control
fields (control pointer, condition codes) are held
as separate registers, the remainder being found
in the general register file and protected from
mis-use by overall controls on program construction.
They include base and task indices, stack
base and current stack frame, current control segment
index.

The local memory controller serves requests
for data and instruction accesses within the processor
and external requests arriving via the
global memory controller. The memory operations
include normal fetch and store of byte, word and
tagged values, and 'planar' accesses arising from
the use of local memory as an active storage
device.


The four high order bits of each register
contain a tag, as shown in Table 1. The remaining
28 bits are interpreted accordingly. The format
of tagged elements in store is the same as for
registers. Note that tags 0..7 are 'global', and
have the same meaning for every machine in an
assembly, while tags 8..F are addresses with no
meaning outside the processor in which they occur.

It can be seen from Table 1 that capabilities
and codewords have 'arithmetic' and 'non-arithmetic'
formats: in the former the object index
or identifier can be altered by arithmetic operations.
In neither case can the class or segment
index be changed without authority. A distinction
can thus be drawn between a 'singular' reference to
an object or element of a segment and one that can
be treated as one of a sequence.
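The register layout just described can be sketched in Python. With a four-bit tag in a 32-bit register, 32 - 4 = 28 bits remain for the body. The function names are hypothetical, and because the meaning of each individual tag is given only in Table 1, the sketch stops at separating the two fields and classifying a tag as global or local.

```python
WORD_BITS = 32
TAG_BITS = 4
BODY_BITS = WORD_BITS - TAG_BITS   # 28 bits left after the tag
BODY_MASK = (1 << BODY_BITS) - 1


def split_register(word):
    """Return (tag, body) for a 32-bit tagged register value."""
    word &= (1 << WORD_BITS) - 1   # keep only the low 32 bits
    return word >> BODY_BITS, word & BODY_MASK


def is_global(tag):
    """Tags 0..7 are global; tags 8..F are processor-local addresses."""
    return 0 <= tag <= 7


local_example = split_register(0x90000005)    # tag 9, body 5
global_example = split_register(0x7FFFFFFF)   # tag 7, body all ones
```

Because the tag occupies the high-order bits, the global/local test reduces to a check of the most significant bit, which fits the stated requirement that tag interpretation be resolved in a single micro-order.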

Local objects are the addresses in local
memory (starting at byte position F or plane P) of
L+1 consecutive elements of the specified type.

The local store is extended by an optional planar
store which serves as a back-up for the (presumed)
faster local memory. In microPN planes are just
256 bits in size, and to enjoy the full advantage
of the addressing scheme it is envisaged that
planes of 1024 or 4096 bits will be used in practice.
Data is transferred between levels via the planar
register unit.

Global  sagments  are  addressed  indirectly 
by  the  global  memory  controller  through  a  segment 
table  which  might  be  associated  with  another  micro¬ 
PN  processor  in  the  same  assembly.  Segment  table 
entries  have  the  same  form  as  addresses.  Figure  5 
shows  the  principle  of  interprocessor  communication 
assuming  a  bi-directional  data  and  address  bus  of 
32  bits.  The  requesting  program  applies  a  memory 
function  m  to  the  codeword  (e,i)  .  From  o  the 
position  of  the  'host*  is  found*  if  not  in  the  same 
processor-memory  pair  the  parameters  are 

transmitted  to  the  receiving  module,  where  the 
function  m  is  intsrprsted  with  rsference  to  its 
segmsnt  table.  A  suitable  reply  is  sent  to  the 
requesting  program.  Details  of  the  interaction 
depend  on  performance  objectives  and  cannot  be 
meaningfully  examined  until  program  design  strate¬ 
gies  have  been  fully  explored. 
The 'plane ALU' operates on three planar registers, each 256 bits in
microPN: an accumulator which can be regarded as 16 words of 16 bits
or one bit from each of 256 words stored in plane sequence; a carry
plane associated with the accumulator for bit-serial operations; and
an activity plane that selectively controls store write operations. A
further set of operations is provided to move the accumulator in
either 'row' or 'column' direction, with linear or cyclic edge
connections. The planar functions are designed primarily to assist in
high speed operations on numerical data, digitised images, signal
data, etc. However, in the present context planes play a prominent
part as 32-byte units of memory allocation, and planar functions are
used in module interconnection and scanning operations. The
conventional store operations are extended to transmit numeric data
between general purpose registers and word planes along common row or
column data lines. Hence the design achieves another fundamental
objective, that of easy transition between 'parallel' and 'scalar'
modes of operation.
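
The movement of the accumulator and the role of the activity plane can
be illustrated by a small sketch. The following Python fragment is not
microPN code: it assumes a plane held as 16 rows of 16 bits, and the
function names, and the choice of which movement counts as the 'row'
or the 'column' direction, are invented for the illustration.

```python
SIZE = 16  # a microPN plane holds 16 x 16 = 256 bits


def shift_along_row(plane, cyclic):
    """Move every bit one place along its row (assumed 'row' direction).

    With cyclic edges the bit leaving one end re-enters at the other;
    with linear edges it is lost and a zero is shifted in.
    """
    result = []
    for row in plane:
        fill = row[-1] if cyclic else 0
        result.append([fill] + row[:-1])
    return result


def shift_along_column(plane, cyclic):
    """Move every row one place down the plane (assumed 'column' direction)."""
    top = list(plane[-1]) if cyclic else [0] * SIZE
    return [top] + [list(row) for row in plane[:-1]]


def masked_write(store, accumulator, activity):
    """Write accumulator bits to the store only where the activity bit is 1."""
    return [
        [acc if act else old for old, acc, act in zip(s_row, a_row, m_row)]
        for s_row, a_row, m_row in zip(store, accumulator, activity)
    ]
```

In this sketch the activity plane behaves as a per-bit write enable,
which is one plausible reading of "selectively controls store write
operations".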

In a tagged machine the instruction set is designed to carry out
normal arithmetic and logical functions on numeric data and to provide
separate functions for operating on pointers. Thus the 'modify'
function in various forms applies to any address and increases F (or
P) by a given amount, decreasing L accordingly. The 'limit' operations
reset L to a lower value. If the bounds of the original sequence are
exceeded an 'invalid address' (system capability class 8, see below)
is returned. In that way the current protection domain can be
delineated with a precision of one byte.
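
The effect of 'modify' and 'limit' on an address can be sketched as
follows. This is an illustration only, written in Python rather than
in PN machine functions: the class names are hypothetical, elements
are assumed to be one byte long, and the rejection of negative amounts
is an assumption not stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    first: int  # F: position of the first element
    limit: int  # L: the address covers L + 1 consecutive elements


@dataclass(frozen=True)
class InvalidAddress:
    reason: str  # stands in for a class 8 (function error) capability


def modify(addr, amount):
    """Advance F by `amount` and reduce L to match; never widen the range."""
    if amount < 0 or amount > addr.limit:
        return InvalidAddress("modify outside original bounds")
    return Address(addr.first + amount, addr.limit - amount)


def limit(addr, new_limit):
    """Reset L to a lower (or equal) value; a larger value is refused."""
    if new_limit < 0 or new_limit > addr.limit:
        return InvalidAddress("limit outside original bounds")
    return Address(addr.first, new_limit)
```

Because neither function can produce an address that reaches outside
the one it was given, any sequence of these operations stays within
the original protection domain.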

In microPN there are eight primary function groups, of which four are
tag-independent and four restrict the tag of one or two
general-purpose registers. The tag limitations can be simply expressed
in tabular form and as far as can be seen would have very little
effect on cost or speed. Nevertheless the essential protection
mechanisms have been retained.

An incidental effect of the PN protection scheme is to make it easy to
apply 'execute-only' options to control segments. Advantage has been
taken of that to preserve some engineering flexibility and to
undertake some security checks during program translation. For
example, all register, base, label and system function indices are
checked by the compiler and written into code sequences knowing that
they cannot be changed by the user. Similarly, privileged function
codes (such as 'set tag') can be generated without direct control by
the programmer and there is no need for a distinct 'microsystem
state'. There is, of course, the possibility of code being corrupted
by store malfunction which, like pointer errors, could lead to wider
breakdown. Whether to control such errors by further checks on the
code, the pointers, the task space, the processor, ... or at some
other boundary depends on the type of reliability and availability
that is demanded.
The PN system supports ten classes of abstract objects, see Table 2.
The aim of each abstraction is to disclose as much about each class as
the user needs to know in order to operate on it efficiently,
concealing attributes that are irrelevant or liable to change. For
example, binary instruction formats are concealed in the definition of
control segments in order to allow freedom to change the instruction
coding. The system abstract objects constitute the resources available
for program construction at the lowest design level. To reach the
level of facility normally seen by application or system programmers
new classes of object such as 'message' or 'queue' will be implemented
in terms of those that already exist. The use of separate tag codes
for 'system' and 'user' capabilities, while not strictly necessary, is
helpful in defining system structure.


TABLE 2

PN System capabilities

(All elements in this group have tag 4; the index value id identifies
a member of the class c)

c =  0    Null
     1    Control segment
     2    Pointer segment
     3    Base
     4    Task
     5    File
     6    Host (Processor-memory pair)
     7    CFC (see text)
     8    Function error
    16    Numeric segment

The  principles  of  capability  management 
are  widely  understood,  so  we  examine  here  only 
aspects  peculiar  to  the  PN  system. 

Data  segments 

An important distinction is drawn between data segments (identified by
numeric or pointer capabilities) and access paths to them (identified
by codewords). A given segment may be accessible through 0, 1 or more
such paths at a time, each using a distinct index. Their allocation is
controlled by system functions to facilitate data sharing at global
level. The distinction is important because not all operations on
segments demand access to individual elements: for example, one might
want to know the type or size, position in the hierarchy, or simply to
pass the segment capability as a parameter.


Control segments

In the same way, control segment capabilities are distinguished from
control pointers (tag 1 or 5). A control segment contains encoded
instructions and data derived from definitions given in the system
programming language. Although many features of the PN machine are
abstracted, the segment size, which contributes to channel loading and
working set requirements, is not: in microPN the maximum size is 4096
bytes. There is only a weak connection between segments and control
flow, i.e. change of segment does not imply change of procedure, nor
vice versa, the reason being that although one can sometimes take
advantage of such conventions it is usually undesirable to couple
logical control structure to physical store assignment.

The definition of control segments includes a precise specification of
the registers they use, their entry points, and external connections
that may be established with reference to the environment at time of
use. The compiler, in conjunction with machine functions, ensures that
the bounds so defined are strictly observed. That is the essential
requirement of software engineering, brought down to 'micromachine'
level. A logical property of a control segment (Figure 6) is that the
only resources it can use are those defined in or accessible from the
registers at a point of entry (e1, e2, or e3 in Fig. 6), or those
acquired by expansion (m1 or m2), or those that it creates by using
one of the resource managers.

Figure 6: Interconnection of control segments


It is theoretically attractive to have precise control over which of
the entry points to a module can be used in a given context. For
example, if M controlled a class of queues and e1, e2 and e3 allowed
users to 'join', to 'leave', and to 'delete' a specified queue, it
might be desirable to withhold e3 from all but a limited subset of
users. That would mean having distinct pointers for each entry point
and increased overheads in the management of bases. On balance, it is
preferable to define only a single codeword for the module, say M, and
to enumerate the entry pointers as M, M+1, and M+2, corresponding to
e1, e2 and e3 in the example. More precise control can be achieved by
(a) using separate control segments for 'join' and 'leave' on one hand
and 'delete' on the other; (b) using part of the identifier field to
encode the permissible operations (the 'access options' in Fig. 2); or
(c) controlling the indexing operations in a higher level language.

Once formed, a control segment is ready for execution. There is no
need to load or consolidate it into a particular program, task or
processor space. The reason for that design decision is that it gives
the greatest flexibility in program construction at a cost which, from
experience of similar systems, appears to be small. External
connections are defined by reference to the current base and task, but
since the same segment might be in concurrent execution with reference
to several different bases and tasks, each with different components,
the environmental vectors are treated as 'sparse' and connection is
made by an associative search using the resource name as argument. The
association is done by parallel (planar) operations and is relatively
fast.

The only method of expanding rights is via the list of resource names,
and strictly speaking the inclusion of a name in a control segment
should be subject to formal checks. It would be possible to give a
list of 'valid' names to each user or software design group, but here
again the advantage gained from a strict rule of construction must be
balanced against the cost of administering it. In our experience
informal controls are sufficient for most applications, wherein the
'prior authority' can verify by inspection of the source code that a
control module (such as M) cannot extend its effect beyond the
permitted bounds (such as m1 and m2).

Function  errors 


For any machine or system function construed as 'failing' there is a
choice of aborting the task or returning a recognisably invalid result
from system capability class 8. The choice is a practical matter: for
example, illegal tags abort the program, whereas address overflow
returns an invalid address. If the former option is taken the 'result'
of a task is itself a class 8 capability. In all cases the encoding of
the index field gives the function type and reason for failure.

A similar convention can be applied in the user domain, returning
class 0 system capabilities ('Null') to indicate failure. With regard
to dynamic type checking, the user can easily 'break open' a
capability to examine its class and tag fields. There are three
courses of action:

(a) to assume all types are correct and expect to fail later (e.g. on
    tagcheck) if they are not;

(b) to check types and fail gracefully; or

(c) to check types and return a Null result.

There are many tactical variations; which to use depends on the level
of understanding between caller and callee, and it is important not to
pre-empt the decision in system design.


Capability  management 


To form a new class of abstract objects the designer requests
permission from the system, which returns a
capability-forming-capability (CFC) containing the index c of the new
class. To form a new capability one can then present to the system
that CFC together with the object index id. In return a tag 6 user
capability, class c, index id is obtained:


    CFC (class c)  and  object id  give  user capability (tag 6, class c, index id)
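
The two steps can be sketched in Python. The sketch is an
interpretation, not the PN system interface: the `System` class, its
method names and its class allocator are hypothetical, and the CFC is
modelled as a system capability (tag 4, class 7 of Table 2) whose
index holds the new class number. Raising an exception on misuse is
also an assumption; the PN system might instead return a class 8
capability.

```python
from dataclasses import dataclass

SYSTEM_TAG = 4  # Table 2: system capabilities carry tag 4
USER_TAG = 6    # the text: user capabilities carry tag 6
CFC_CLASS = 7   # Table 2: system capability class 7 is the CFC


@dataclass(frozen=True)
class Capability:
    tag: int
    cls: int
    index: int


class System:
    """Hypothetical stand-in for the system functions that issue capabilities."""

    def __init__(self):
        self._next_class = 0  # assumed allocator for user class numbers

    def new_class(self):
        """Grant a new class of abstract objects; return its CFC."""
        c = self._next_class
        self._next_class += 1
        return Capability(SYSTEM_TAG, CFC_CLASS, c)

    def form(self, cfc, object_id):
        """Present a CFC and an object index; obtain a user capability."""
        if (cfc.tag, cfc.cls) != (SYSTEM_TAG, CFC_CLASS):
            raise PermissionError("a CFC is required to form a capability")
        return Capability(USER_TAG, cfc.index, object_id)
```

Only the holder of the CFC can mint capabilities of its class, which
is why the CFC is kept private to the managing package.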


We now see that the typical 'package' dealing with a class of objects
consists of a manager M, whose name is made public, and essentially
private data structures such as the master object table and CFC whose
names (m1 and m2) are excluded from other segments. Disclosure of M
will also document the functions of its entry points.

The 'difficult' aspects of M are concerned with index management
which, as we saw earlier, leads to various forms of evasion. In
microPN, system support is offered to delete either (a) a given
capability or (b) a capability class (authorised by the CFC) from
program space.


Inevitably, pointers must be scanned looking for such capabilities. In
a multicomputer system the rate of scanning store has two important
characteristics: (i) it is relatively high, because of the close
connection between processors and memories, and (ii) it is roughly
constant because additional memory brings with it additional
processing power. As a result we can suggest index management
strategies based on the use of small m.o.t.'s whose entries are
recycled when no longer in use.


For practical reasons store allocation is serviced by a special set of
system functions, but the above comments on index management are
equally applicable to codewords and addresses. The planar memory
functions are particularly important in store compaction.

*  *  * 

In summary, it might be said that the main problem of microsystem
design is not to invent new facilities but to select a basic subset
from the range of possibilities on offer. It is paradoxical that at a
time of great abundance in hardware the need for stringency in design
is greater than ever, but the fact remains that there are great
dangers from 'overkill' in hardware and software. In microPN the
decisive factors are the need to maintain security at microprogram
level, and unwillingness to suffer loss of performance in doing so.
Emphasis is therefore placed on the ability to construct high level
systems rather than commit the design in one direction or another.
Simulation

The PN design is based on a computer module assumed to be comparable
in speed and complexity with current microprogrammable machines.
Besides playing its part as a member of an assembly, each must satisfy
the most exacting requirements of program reliability and language
implementation, which carry over (still unsatisfied) from conventional
design. Before making specific hardware recommendations it is
necessary to study in depth the program organisation and behaviour
that can be expected in practice, so the approach has been to simulate
one computer module and to make measurements from which the
performance of an assembly can be inferred. The simulator runs under
the UNIX operating system on the PDP-11 series of computers.
Facilities available include a system implementation language, system
support and error management functions, library and on-line
documentation.

A multitask system is simulated, and between any two control points it
is possible to count:

(i)    instructions obeyed
(ii)   local store accesses
(iii)  global store accesses
(iv)   stack usage
(v)    procedure calls
(vi)   module interconnections
(vii)  planar functions obeyed
(viii) planar routing distance, and
(ix)   interrupts.

Elapsed time in the host system is also available, and PN system
functions can readily be modified to give measures of resource usage,
static measures of instruction coding, etc. The significance of the
above figures should be clear. Taking store traffic as the main
parameter of performance, it is found that for every 100 bytes of
instruction about 30-60 further bytes of data are handled. If the data
were all global, the overhead of segment table access would thus be
25-40%, but that is never the case: it is rare for less than 90% of
data accesses to be local and we conclude that the overhead is
negligible.
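
The text does not say how the 25-40% figure is derived. One reading
that roughly reproduces it is sketched below in Python, on the assumed
basis that each global data access costs one extra segment-table
access of the same size, and that the overhead is expressed as a
fraction of the instruction-plus-data traffic. The function name is
invented for the illustration.

```python
def segment_table_overhead(instruction_bytes, data_bytes, global_fraction):
    """Extra segment-table traffic as a fraction of instruction + data traffic.

    Assumption: every global data byte incurs one extra byte of
    segment-table traffic; local data incurs none.
    """
    extra = data_bytes * global_fraction
    return extra / (instruction_bytes + data_bytes)


if __name__ == "__main__":
    for data in (30, 60):
        all_global = segment_table_overhead(100, data, 1.0)
        mostly_local = segment_table_overhead(100, data, 0.1)
        print(f"{data} data bytes per 100 instruction bytes: "
              f"{all_global:.3f} if all global, {mostly_local:.3f} if 90% local")
```

On this reading the all-global case gives roughly 23% to 38%, close to
the quoted range, and the case with 90% of data accesses local gives
under 4%, which is consistent with the conclusion that the overhead is
negligible.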

As already shown, change of access list is implicit in moving from one
section of code to another, but for each register saved or restored at
a domain boundary six bytes of data and instruction are used. A
complete task change in microPN generates about 500 bytes of store
traffic, while the search of external names associated with module
interconnection generates about 200 bytes (50 instructions obeyed). In
scanning operations, about one machine instruction is obeyed for each
pointer examined, so that a typical stack (less than 100 tagged
values) would be scanned in 10 µsec.

The above figures begin to provide the context for high level program
design decisions, e.g. whether to use global or local workspace, how
to distribute segments across computer modules, when to use advanced
forms of binding, what mixture of interpretive and in-line control to
use, and so on. Quite often a high performance figure is traded for
some other attribute such as resilience or responsiveness which is
difficult to quantify. A vital objective is to achieve performance in
convertible shape: that applies particularly to the levels of
abstraction and control, because their interfaces undoubtedly decide
whether what is possible in theory is actually achieved in a practical
system.

Costs are equally difficult to quantify, and care must be taken to
compare designs with similar facilities. For software engineering,
controlled pointer formation is far more effective than segment/page
table control and costs much less on a gate-for-gate basis. The most
conspicuous cost of microPN is the 16x16 bit planar arithmetic unit,
whose main contribution to the system is in memory management. Its use
enables the dedicated control and scratchpad stores normally found in
microprogrammed machines to be dispensed with, thus removing a serious
obstacle to microsystem support for high level languages; it also
enables a far more flexible approach to be taken in capability
management and program construction than has been possible in earlier
systems. Whether it will be justified on balance remains to be seen.

Finally, it should be stressed that the mechanism outlined here, while
appropriate to the control of program space, does not preclude the use
of other abstraction devices. It would be possible, for example, to
superimpose a capability mechanism extending into the file space. It
would be advantageous to deal with some forms of abstraction by 'soft'
methods in the confines of particular languages. On the other hand, to
make the basic architecture part of a language or file system
specification would be fundamentally bad design.
1.0 Abstract

The input/output interface has traditionally been a source of trouble
in computer systems. A hierarchical model, based on Finite State
Machines and appropriate to both hardware and software, is presented
which addresses these problems. This model is of interest for several
reasons: first, it suggests a structure for the design of input/output
subsystems; second, it is amenable to automatic manipulation using
well-known algorithms (e.g. state minimization); third, it is easily
and efficiently implemented in software, firmware, or hardware;
fourth, automatic generation of tests is possible.


2.0 Introduction

While progress has been made in other areas of computer system design,
the input/output area has been totally neglected. We speak of an
architecture as being 'language directed' to indicate that it embodies
the philosophy of a language. We recognise an instruction set, say, as
being high level. Or we construct a memory system to ensure an
abstract requirement such as security. But try as we may, no guiding
principles can be found for input/output systems. About the only
general statement to be made is that data is transported between the
outside world and the processor/memory.


High level languages have long been looked to as unifying concepts for
processor and storage architecture. Significantly, input/output
interfaces are programmed almost universally in an assembly language,
not a high level language. It is symptomatic of the lack of progress
in this area that the programs which deal with I/O are still
constructed in the most primitive language. High level languages are
considered to be too inefficient. This points out the lack of a
unifying structure at the input/output interface.

We wish to investigate input/output interaction at the actual
hardware/software interface. Previous work [1,2] has emphasized the
notion of a device as an asynchronous process. This is appropriate,
since synchronization is an important issue in dealing with peripheral
devices. This paper, though, deals with the input/output system at a
different level: the actual hardware/software interface. The two views
are complementary in that we do not remove asynchronous activity from
the I/O area, but rather present a more software compatible view of
the I/O interface for the device processes to deal with.

3.0  Current  Practice 

Given that a particular piece of equipment is to be connected to a
computer system, typically a hardware designer steps in and designs a
controller. The hardware designer is given the device input-output
characteristics; these may involve a fairly large number of analog
and/or digital lines subject to varying electrical, physical, and
logical constraints. The product of the hardware designer's labors is
the logical device visible to the programmer as a set of I/O ports or
memory registers. Then a prototype is built and the hardware debugged.

Now a programmer enters the scene and designs a device driver (or
handler) to connect the logical device to the operating system (and,
in turn, to application level programs). The starting point for the
programmer is the logical device constructed by the hardware designer.
The logical device appears as a collection of bits which represent
status or commands and a data register for data or addresses. The
lines to the device which had a very distinct identity to the hardware
designer have become a homogeneous, somewhat anonymous set of bits to
the programmer. In the case of the status and control bits, they may
be mixed together (note that status bits are to be read, and command
bits are to be written), and incidentally grouped. The problem,
though, is that the programmer tends to view the device
one-dimensionally. All status bits or all command bits are viewed as
being equally important on the same level. But all bits are not
equally important; the hardware designer understands this and, for
example, will not allow the controller to function if the device is
not initialized. This one-dimensional view leads the programmer to
check the status of the device thru such code sequences as:


    begin
      if statusbit(m) = on then
        if statusbit(n) = on then
          ...
            if statusbit(p) = on then ...
    end;

or

    begin
      if statusbit(m) = on then ...
      elseif statusbit(n) = on then ...
      ...
      elseif statusbit(p) = on then ...
    end

or some combination of the two. In the first case, the number of
possible paths thru the code for n status bits is 2**n. Note that in
many device interfaces the importance of the status bits is not at all
apparent; that is, there is no simple way to determine the importance
of the status bits. The programmer must check bit 5, then bit 7, then
bit 3, or check different sequences of bits according to whether a bit
is on or not. Actually, the situation is even worse: complex devices,
such as communication drivers, can require different behavior to the
same status depending upon the prior history of the device.

Finally, the programmer has a driver design. He codes it and must
debug it. This can be a harrowing experience, for the programmer is
confronted with a new piece of hardware which may malfunction, or he
may have misunderstood just how the controller works, or the hardware
designer may have given him a controller which is difficult or
unwieldy to deal with, or his code may be incorrect, or ... (the
reader is invited to fill in other reasons). The driver may not work
and he can't tell if the problem is hardware or his software. So he
calls in the hardware designer to help him. But now, communication
between the two may compound difficulties.
This paper will attempt to solve these problems. A Finite State
Machine (FSM) model will be presented which is suitable for
implementation in hardware, software, and firmware. The FSM has
several desirable properties which make it attractive as a
hardware/software implementation vehicle. The designer is forced to
explicitly account for all situations which may arise. It is
sufficiently high level to serve as a common design language while
hiding low level implementation details. It is amenable to automatic
manipulation using well-known algorithms [3]. Imposing suitable
restrictions on the model (i.e., hierarchical structure), and forcing
the interface registers (the logical device) to conform to a certain
standard format which is particularly economical (in hardware) and
efficient (in software), reduces the number of states to a manageable
set. In fact, exhaustive testing may become feasible. Automatic
generation of tests is also possible [4,5,6]. Lastly, the FSM collects
together sufficient information to provide a history of operation
which can be useful for checkout, testing, and performance evaluation.

4.0  A  General  Model 

In designing an I/O subsystem, both data and control must be
considered. At the operating system interface, control is simple and
the data complex; at the device, the data is simple and the control
complex. For example, an array (buffer) of words is presented to the
I/O subsystem with a request for transfer. The I/O subsystem attempts
the transfer and replies with either success or failure. At the lowest
level, though, single words might be transferred one at a time, with
an acknowledgement after each transfer. An error will cause retries or
a failure status to be returned.
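
The buffer example can be sketched as two levels of control in Python.
This is an illustration under assumptions, not the paper's model: the
function names, the `send_word` callback (taken to return True when
the device acknowledges a word) and the retry limit are all
hypothetical.

```python
def transfer_buffer(words, send_word, max_retries=3):
    """Upper level: one 'transfer' command for a whole buffer.

    Replies with True (success) or False (failure), hiding the
    word-by-word control from the caller.
    """
    for word in words:
        if not transfer_word(word, send_word, max_retries):
            return False
    return True


def transfer_word(word, send_word, max_retries):
    """Lower level: send one word, retrying until it is acknowledged."""
    for _ in range(1 + max_retries):
        if send_word(word):  # True means the device acknowledged the word
            return True
    return False
```

The upper level sees simple control (one command, one reply) and
complex data (a buffer); the lower level sees simple data (one word)
and more complex control (acknowledgement and retry).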

The general model, then, is hierarchically structured. Each level
translates a single command into a set of commands to a lower level
(the control consideration). Also, a data type is translated into a
different, more detailed data type for the next lower level. Each
level, then, successively refines both control and data to a more
detailed form. Adjacent levels share common control and data
structures. The next section will present a more specific model for
the realization of the I/O subsystem.

5.0 Finite State Machines

The reader is assumed to be familiar with the concept of a Finite
State Machine (FSM) [7]. FSMs will be briefly defined in order to
present notation. We shall deal with the Mealy model of an FSM as it
seems to offer technical simplifications for our purposes.

A Finite State Machine is defined as a sextuple

    < S, I, O, NSF, OF, S0 >

where

    S    -  a set of states
    I    -  a set of inputs
    O    -  a set of outputs
    NSF  -  a next-state function,  NSF : S x I -> S
    OF   -  an output function,     OF  : S x I -> O
    S0   -  the initial state

Where there can be no confusion, we may omit explicitly listing the
various sets. We will rely on context to implicitly define them by
giving the Next State and Output functions either in tabular form as
in Figure 1a or in graphical form as in Figure 1b.

An FSM operates as follows: It begins operation in its initial state.
Receiving an input, it performs some output dependent on its state and
input, then it moves to another state, again according to its current
state and input. The process repeats continuously. Figure 2 shows a
skeleton program which implements this process.
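
A skeleton of this kind might look like the following Python sketch of
a Mealy machine. It is not the program of Figure 2: the function name,
the dictionary representation of NSF and OF, and the two-state example
(which reports a 0-to-1 change on a single input line) are all
assumptions made for the illustration.

```python
def run_fsm(nsf, of, initial_state, inputs):
    """Mealy machine: for each input, emit OF(state, input), then move to
    NSF(state, input). Returns the final state and the list of outputs."""
    state = initial_state
    outputs = []
    for symbol in inputs:
        outputs.append(of[(state, symbol)])
        state = nsf[(state, symbol)]
    return state, outputs


# Hypothetical machine in tabular form: states 'low' and 'high', inputs 0 and 1.
NSF = {("low", 0): "low", ("low", 1): "high",
       ("high", 0): "low", ("high", 1): "high"}
OF = {("low", 0): "-", ("low", 1): "rise",
      ("high", 0): "-", ("high", 1): "-"}

if __name__ == "__main__":
    print(run_fsm(NSF, OF, "low", [0, 1, 1, 0, 1]))
```

Because NSF and OF are plain tables indexed by (state, input), every
combination must be given an entry, which reflects the point made
earlier that the designer is forced to account for all situations.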

6.0  rieirarchical  Finite  State  Machines 


transition  back  to  its  initial  state,  the  oper* 
ation  of  the  submachine  ceases,  and  the  next 
higher  leval  machine  (vdiich  invoked  the  aubma* 
chine)  resumes  operation.  Note  that  since  the 
submachine  initiates  and  terminates  activity  in 
the  same  state,  it  has  no  memory  of  previous 
incarnations. 


A  Heirarchical  Finite  State  Machine  (HFSM) 
is  a  set  of  machines  M[i]  ,  i>0,  such  that 

M[J]  is  an  FSM  i 

<S[i),  Hi] ,  Oil] ,  »Ji'[i),  0FU1,  S0[i]  > 
augmented  by 

<  EStU,  ILfUl  > 
where 

ES  is  contained  in  S[i] 

ILF [1)  s  ES[il  *>  Mljl  where  j>i  . 


we  mention  in  passing  that  an  HFSM  is  ext 
actly  equivalent  to  a  much  more  complex  FSM. 
Thus,  an  HFSM  has  no  greater  theoretical  power 
than  an  FSM.  Practically,  though,  it  has  sev* 
eral  advantages: 

1.  Heirarchical  structure  which  may  be 
designed  and  implemented  in  a  top-down 
fashion. 


As  is  often  the  case  with 
automata* theo retie  definitions,  the  formalism 
i!  appears  complex,  yet  the  operation  of  the  de- 
>■'  fined  machine  is  simple. 

Intuitively, an HFSM is a collection of
FSMs with a mapping between the states of a
machine at one level and the machines of the next
lower level. That is, a state of a machine at
level i may be associated (by an Inter-Level
Function ILF) with a machine at level i+1. Not
all states need be mapped to a lower level
machine: The states that are so mapped are


2. A clear separation of concerns
(inputs-outputs) at each level.

3. An HFSM may be implemented with less
memory than the equivalent FSM, since
the HFSM is a collection of small FSMs
rather than a large FSM. The next-state
and output functions grow as the
product of states and inputs, and a
single FSM may require a large amount of
memory to represent these functions.
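
The memory argument in point 3 can be made concrete with a little arithmetic. The machine sizes used here are hypothetical and are not derived from any particular HFSM or from the equivalence construction; they only show how a table that grows as states times inputs compares with the sum of several small tables.

```python
def table_entries(states, inputs):
    """Entries needed for one next-state table plus one output table."""
    return 2 * states * inputs


# Hypothetical sizes, chosen only for illustration.
flat_fsm = table_entries(states=64, inputs=16)

# A hypothetical hierarchy of nine small machines, each 8 states by 4 inputs.
small_machines = [(8, 4)] * 9
hfsm_total = sum(table_entries(s, i) for s, i in small_machines)

print(flat_fsm, hfsm_total)
```

With these assumed sizes the single large machine needs 2048 table entries while the nine small machines together need 576.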

6.1  Inputs 


termed explosive (the set ES in the above
definition). Figures 3,4 illustrate the
structure of a simple HFSM. The HFSM operates
similarly to an FSM with one exception: when
an explosive state is reached, the execution of
the HFSM at that level is suspended, and the
submachine corresponding to the explosive state
is activated. The submachine starts in its
initial state, and execution commences around its
state transition graph. The submachine may, in
turn, contain explosive states, in which case a
sub-submachine is recursively activated, and so
forth. When the submachine finally makes the


Inputs are usually specified in simple
examples as single symbols, for example, '0' or
'1'. Implicitly, we mean two distinct events:
first, that an input is present, and second,
that the input has some given value. We wish
to deal with asynchronous systems, so input
evaluation does not occur until an input is
present.


For certain systems, a single input symbol
may not be sufficient. In that case, an input
can be considered to be a condition which is to
be evaluated as true or false. Only one input
may be true. The single input symbol is a
special case; it is simply the condition input =
symbol.

6.2 HFSM Data

The previous section presented the flow of
control of an HFSM. To be useful, though, it
must be possible to pass data through the HFSM.
Several data buffers are provided to each
machine: an input-output pair to be used for
communicating with the next-higher level (i.e.
the invoking) machine, and an input-output pair
for each explosive state to be used for
communicating with submachines. Note that because
the HFSM is a strictly sequential machine, one
pair of data buffers may be used to communicate
with all next lower level machines. Four
primitives are provided for utilizing these
buffers:

1. Read from Above (RA) - Read the data
buffer containing data from the next
higher level machine.

2. Write to Above (WA) - Write data into
the data buffer of the next higher
level machine.

3. Read from Below (RB) - Read the data
buffer containing data written by the
submachine corresponding to the last
explosive state.

4. Write to Below (WB) - Write to the
data buffer which can be read by the
submachine corresponding to the
explosive state being entered.

Note that the data passed by the Write to
Below (WB) function to the next lower level,
and received there by the Read from Above (RA)
function, must agree in type. Similarly, the
Read from Below (RB) and Write to Above (WA)
functions must agree in type.


7.0 Hardware Implementation

The implementation of an HFSM in hardware
is fairly straightforward. It is similar in
operation to the software version presented
earlier. However, certain additions are made
in order to facilitate testing and to
accommodate the lower bandwidth communication channel
between the device controller and main memory.
Each machine may be implemented in its most
convenient form, most likely as a
microprogrammed controller [8]. In this connection note
that the control of all machines is identical.

Each machine must provide to the software
driving it the information listed in figure XX.
All fields are encoded as small integers so
that simple indexed table look-ups and CASE
statements may be used to access the HFSM. The
machine ID field identifies the submachine.
The state, input, and output fields describe
the machine state and its environment. So far,
the hardware implementation is exactly the same
as the software version. One extra item is
added to the hardware version: the machine ID
interrupt level. This is a register loaded by
the software driver at initialization time
which specifies which machine's state
transitions cause interrupts (or equivalently, when
software interaction is needed).

Transitions of machines which have IDs not
equal to the machine ID interrupt level proceed
at their own rate. The machine specified in
this register is the highest level machine in
the hardware. This is the hardware machine
which interacts with the software machine.
The lower level machines are simple, execute
quickly, and do not require software
intervention.


8.0 Testing

Testing the hardware portion of the HFSM
is made possible by the variable
hardware/software interface, the machine ID
register. In the event of a hardware failure,
indicated by illegal state transitions and the
like, the software can test the hardware
portion of the HFSM. The test portion of the
software contains a duplicate implementation of
the lower levels of the hardware machines. Of
course, this duplicate is not used for normal
operation, but only for testing. The software
sets the machine ID interrupt level to the
next lower level machine and executes a
predefined test. It compares the execution of the
hardware machine to its own simulation and
records the differences. These differences
locate the faulty state transitions. If there
are no discrepancies, then it repeats the
process on the next lower level machine. It
continues checking lower level machines until
faulty transitions are isolated.
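
The level-by-level comparison described above can be sketched as follows. This is a hedged illustration only: the real hardware is stood in for by a transition table, and the names observe, faulty_transitions and isolate_fault, as well as the table layout, are assumptions made for the example.

```python
def observe(table, inputs, initial_state=0):
    """Drive one machine and record each (state, input, next state) seen."""
    state = initial_state
    seen = []
    for symbol in inputs:
        nxt = table[state][symbol]
        seen.append((state, symbol, nxt))
        state = nxt
    return seen


def faulty_transitions(hardware_table, reference_table, inputs):
    """Compare observed hardware transitions with the software duplicate.

    Returns sorted tuples (state, input, observed next, expected next).
    """
    bad = {
        (s, i, n, reference_table[s][i])
        for s, i, n in observe(hardware_table, inputs)
        if reference_table[s][i] != n
    }
    return sorted(bad)


def isolate_fault(hardware_levels, reference_levels, inputs):
    """Check machines from the highest hardware level downward.

    Stops at the first level whose transitions disagree with the reference.
    """
    for level, (hw, ref) in enumerate(zip(hardware_levels, reference_levels)):
        bad = faulty_transitions(hw, ref, inputs)
        if bad:
            return level, bad
    return None, []
```

Each observed transition is checked against the reference table entry for the state the hardware was actually in, so a single faulty transition is reported once rather than as a cascade of later mismatches.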

9.0 Unresolved Issues


countered in practice, of reasonable
computational expense. In passing, we
note that the FSMs which we have used
are fairly small (9-18 states and a
like number of inputs). References
[11,12] suggest a connection with
synchronization using regular path
expressions.

2. Certain types of exception conditions
are not cleanly handled. Asynchronous
exceptions arising from an external
source do not fit the model well as
they are not the response to some
action and may occur in the middle of
some conceptually indivisible action.
An example of this type of condition
is a power-failure indication. It is
not possible to guarantee that the
power fail occurs only when the
machine is in certain states at some
given level. One might simply include
a power-fail transition in every state
for every machine, but this is an
unsatisfactory solution.


While HFSMs are attractive as a means of
structuring hardware/software, there are
several areas of conventional usage which do not fit
well into the model.

1. Concurrent activity cannot be
expressed within the model. An FSM
cannot represent concurrent threads of
control. More general models, such as
Petri nets, can represent concurrent
activity [9] and have been used as
hardware/software models [18]. But
these models seem to lose some of the
essential simplicity of the FSM model.
Furthermore, many of the interesting
properties of these models are either
undecidable or computationally
expensive. In contrast, FSM questions are
all decidable, and for most FSMs en-


10.0 Conclusions

The input/output interface has
traditionally been a source of trouble
in computer systems. Reasons for this
include a lack of communication
between hardware and software
designers, lack of a unifying framework
for hardware and software
specification, and an inability to completely
test hardware/software interfaces
realistically due to the large number of
states involved. The problem is
particularly apparent in the programs
which mate a piece of hardware (for
example, a peripheral controller) to
an operating system.
procedure fsm;
type
  nextstatetype = array[1..2, 1..2] of integer;
  outputtype = array[1..2, 1..2] of integer;
const
  nextstate = nextstatetype ((1, 2),
                             (1, 2));
  output = outputtype ((1, 2),
                       (2, 2));
var
  currentstate : integer;
  currentinput : integer;

procedure getinput (var inp : integer);
begin { getinput }
  ...
end { getinput };

begin { fsm }
  repeat
    getinput (currentinput);
    case output[currentstate, currentinput] of
      1 : ...
      2 : ...
      n : ...
    end;
    currentstate := nextstate[currentstate, currentinput];
  until forever;
end { fsm };


Figure 2. FSM Skeleton Program.


procedure hfsm;
type
  hfsms = record
    initialstate : integer;
    currentstate,
    currentinput : integer;
    nextstate : array[inputs, states] of integer;
    output : array[inputs, states] of integer;
    explosive : array[states] of boolean;
    submachines : array[states] of integer;
  end;

begin
  currentstate := initialstate;
  repeat
    getinput(currentinput);
    case output[currentinput, currentstate] of
      1 : ...
      2 : ...
      n : ...
    end;
    currentstate := nextstate[currentinput, currentstate];
    if explosive[currentstate]
      then hfsm(submachines[currentstate]);
  until currentstate = initialstate;
end;


Figure 3. HFSM Program Skeleton
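
The recursive activation of submachines in the skeleton above can also be sketched in runnable form. This is a minimal illustration under stated assumptions: the Machine class, its field names, the 0-based numbering, and the log of (depth, output) pairs are all introduced here for the example and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    next_state: list        # next_state[state][input]
    output: list            # output[state][input]
    # Maps each explosive state to the submachine it activates.
    submachine: dict = field(default_factory=dict)
    initial_state: int = 0


def run_hfsm(machine, inputs, log, depth=0):
    """Run one machine until it returns to its initial state.

    `inputs` is shared by all levels; each input is consumed by whichever
    machine is currently active. Outputs are appended to `log` together
    with the nesting depth at which they were produced.
    """
    inputs = iter(inputs)   # an iterator is returned unchanged, so it is shared
    state = machine.initial_state
    for symbol in inputs:
        log.append((depth, machine.output[state][symbol]))
        state = machine.next_state[state][symbol]
        if state in machine.submachine:
            # Explosive state: suspend this level and run the submachine.
            run_hfsm(machine.submachine[state], inputs, log, depth + 1)
        if state == machine.initial_state:
            break           # back in the initial state: this level ceases
```

Because every machine starts and stops in its initial state, the sketch keeps no per-machine state between activations, which mirrors the remark that a submachine has no memory of previous incarnations.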
Abstract

The  software  problem,  measured  in 
such  terms  as  the  high  cost  required  to 
develop,  test,  debug,  and  maintain  pro¬ 
grams,  and  the  high  degrees  of  com¬ 
plexity  and  unreliability  in  programs, 
is  now  the  major  obstacle  to  computing, 
from  microprocessor  applications  to 
large-scale  systems.  One  partial  solu¬ 
tion  is  bringing  semiconductor  tech¬ 
nology,  in  the  form  of  improved  archi¬ 
tectures,  to  bear  on  the  problem.  In 
doing  so,  the  contention  is  that  machine 
architectures  should  not  be  oriented 
toward just programming languages, but,
more  importantly,  provide  mechanisms  on 
which  software  systems  concepts  can  be 
readily  based,  and  provide  a  more 
consistent  programming  environment. 

SWARD,  an  experimental  architecture, 

is discussed as an example of how a
machine  architecture  can  assist  in  the 
solution  of  the  software  problem. 

Introduction 

There is widespread agreement that the
development of software is the largest
problem in the computer field today. The
problem is manifested in the following
ways.  First,  the  production  of  software  is 
a  costly  venture.  The  great  leaps  forward 
in  the  cost  of  digital  hardware  have  not 
been  experienced  in  software  development. 
Where,  in  the  past,  the  software  cost  of  a 
computing  system  was  outweighed  by  hardware 
costs,  the  opposite  is  the  case  today.  For 
instance, the cost of producing a single
instruction in a program for a
microprocessor system probably exceeds the cost
of the processor.

Second,  in  typical  software-develop¬ 
ment  projects,  more  than  50%  of  the 
development  costs  are  expended  in  the 
testing  and  debugging  processes.  Further- 


per  20  statements,  and  worse,  have  been 
reported  in  the  literature.  Hence,  a 
program of significant size, such as
100,000 statements, might initially
contain 5000 errors prior to inspections and
testing. 

Finally, because of the increasing
sophistication of computer applications,
software errors can have rather serious
consequences.

These problems will be exacerbated in
the future by the increasing
sophistication of new computer applications in such
areas as artificial intelligence, defense
systems, transportation and energy
management, and electronic fund transfer.

Software  engineers  and  computer 
scientists  have  been  wrestling  with  the 
software problem for the last decade.
Although  improvements  have  been  made  in 
some  environments  and  organizations,  the 
problem  is  still  a  serious  one.  One 
reason  is  the  recent  explosion  of  the 
amount  and  types  of  programs  being  pro¬ 
duced.  Ten  years  ago,  the  typical 
programmer  could  be  found  producing  a 
simple  Cobol  application  or  developing  an 
operating  system  for  a  computer  manu¬ 
facturer.  Today  we  find  a  much  larger 
programmer  population  developing  such 
applications  as  chess-playing  programs  for 
consumer games, fuel/air mixture
regulators in automotive microprocessors,
coronary-analysis programs in medical
equipment, collision-avoidance algorithms in
aircraft systems, guidance programs in
nuclear missile warheads, and dispatching
systems  for  police  and  fire  equipment. 
Another  reason  is  that  the  largest  areas 
of  software-engineering  research,  namely 
improvements  in  programming  languages  and 
mathematical  proofs  of  program  correct¬ 
ness,  have  not  yet  had  a  significant 
effect  on  the  software-development  process 
in  industry. 


designer  to  be  interested  in  doing  so. 
Given the continuing reduction in hardware
costs, the processor manufacturer must
sell its product in increasingly larger
volumes. Doing so requires increasingly
larger amounts of software, and requires
movement of computer technology into new
application areas. The rate of sale of
computer hardware, from microprocessors to
large-scale systems, is directly related
to how quickly the required system and
application software support can be
produced, and the reliability of that
software.


An  Approach  to  the  Problem 

The  answer  to  how  hardware  technology 
might  help  alleviate  the  software  problem 
is  not  the  simplistic  approach  of  "moving 
software  to  silicon,"  since  there  is  no 
evidence  that  the  problems  mentioned  above 
will  disappear  by  merely  shifting  respon¬ 
sibility  for  the  design  task  from  the  pro¬ 
grammer  to  the  circuit  or  logic  designer. 
Rather, the answer is designing machines
that provide less-hostile environments for
programs, programmers, and end users. The
architect must now face up to broader
considerations, such as

1. Ways in which the architecture can
simplify  the  task  of  application  pro¬ 
gramming,  for  instance,  by  providing 
support  for  more-potent  concepts  of  input/ 
output  and  data  manipulation  in  pro¬ 
gramming  languages. 

2. Ways in which the architecture can
encourage  the  use  of  good  software  design 
and  programming  practices,  for  instance  by 
providing  efficient  support  for  concepts 
of  program  modularity,  information  hiding, 
abstract  data  types,  and  structured  pro¬ 
gramming.  The  motivation  here  and  in 
point  1  is  the  prevention  of  programming 
errors. 


3.  Ways  in  which  the  architecture  can 
assist the costly processes of software
testing  and  debugging,  for  instance  by 
detecting  or  preventing  common  programming 
errors  and  by  providing  a  more-flexible 
base  for  the  development  of  software 
testing  and  debugging  tools. 

4. Ways in which the architecture can
reduce the complexity of one of the most-
complex classes of software, namely
compilers. Such support involves reducing
the semantic gap between languages and the
architecture by tailoring the operations
and objects provided in the architecture
more closely to the corresponding concepts
in programming languages.

5. Ways in which the architecture can
reduce the complexity of another complex
class of programs - operating systems.
This might imply increased awareness in
the architecture of such concepts as
protection, process management, process
synchronization and communication, and
memory management.

Considerations such as these have
been addressed in the literature but
have had little impact, as yet, on
commercially available computer systems.

The SWARD Architecture


An example of an approach to solving
the software problem is an experimental
system under development at the IBM
Systems Research Institute. Although the
current definition of the architecture has
not been published, it has evolved from
earlier published versions.

The five sets of considerations
listed in the previous section are the
design objectives of the architecture.
Detailed objectives were derived for each
of the categories. Many of these
objectives are mentioned in the following
discussion of the architecture.

The  major  attributes  of  the  archi¬ 
tecture, and some of their relationships
to  the  software  problem,  are  outlined 
below. 


Tagged  storage.  The  concept  of 
tagged, or self-identifying, storage is
used  throughout  the  architecture  to  allow 
the  machine  to  understand  unambiguously 
the  attributes  of  the  operands  of  an 
instruction.  This  allows  the  machine  to 
detect  operations  on  incompatible 
operands  and  to  perform  automatic  data 
conversions  during  instruction  processing. 
Each  data  type  has  a  unique  representation 
for  the  "undefined"  state,  allowing  the 
machine  to  detect  attempts  to  use 
undefined  values. 
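
The effect of tagged storage on a single instruction can be sketched as follows. This is an illustration under assumptions, not the SWARD definition: the tag names, the Cell class, the UNDEFINED marker, the MachineFault exception, and the rule that the result is converted to the destination's tag are all hypothetical choices made for the example.

```python
from dataclasses import dataclass

UNDEFINED = object()        # stand-in for a type's "undefined" representation


class MachineFault(Exception):
    """Raised where the sketched machine would detect an error."""


@dataclass
class Cell:
    tag: str                # e.g. "integer", "float", "string" (assumed names)
    value: object = UNDEFINED


CONVERT = {"integer": int, "float": float}   # numeric tags in this sketch


def add(a, b, dest):
    """Generic ADD: check tags and definedness, then convert to dest's tag."""
    for operand in (a, b):
        if operand.tag not in CONVERT:
            raise MachineFault(f"incompatible operand tag: {operand.tag}")
        if operand.value is UNDEFINED:
            raise MachineFault("use of undefined value")
    if dest.tag not in CONVERT:
        raise MachineFault(f"incompatible destination tag: {dest.tag}")
    dest.value = CONVERT[dest.tag](a.value + b.value)
```

For example, adding an integer cell holding 2 to a float cell holding 1.5 with a float destination stores 3.5, while using a cell that was never assigned, or a string cell, raises MachineFault instead of silently producing a result.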

The  tagged  data  elements  (called 
cells)  are  variable  in  size.  The  archi¬ 
tecture contains no fixed-size word
concept  and  permits  machine  instructions 
to  address  only  cells  as  operands;  hence 
the  data  model  provided  by  the  architec¬ 
ture  closely  corresponds  to  the  data 
models  in  programming  languages. 

Nested  tags.  The  tagged  storage 
concept  was  extended  to  allow  tags  to  be 
embedded  within  other  tags,  allowing  the 
representation of higher-order data types
such as arrays, structures/records, and user-
defined types. The machine, rather than
the program, handles the task of array
addressing, and automatically performs
bounds checks. The architecture also
contains explicit representations of
arrays of structures/records and "based
variables."

Capability  based  addressing.  The 
architecture  employs  the  addressing  and 
protection  concept  of  capability-based 
addressing.  The  architecture  views  the 
world  as  a  set  of  objects,  each  being 
given  a  unique  name  by  the  machine  when 
created.  Programs  cannot  fabricate  or 
manipulate addresses, and any reference to
an  object  after  the  object  has  been  des¬ 
troyed  results  in  a  detected  error. 
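
The detection of references to destroyed objects can be sketched with names that are never reused. This is a hedged model only: ObjectStore, Capability and DanglingReference are hypothetical names, and ordinary Python cannot actually prevent a program from fabricating a Capability, which the architecture is described as doing.

```python
import itertools
from dataclasses import dataclass


class DanglingReference(Exception):
    """Raised when a capability names an object that no longer exists."""


@dataclass(frozen=True)
class Capability:
    name: int               # unique name assigned by the machine


class ObjectStore:
    def __init__(self):
        self._names = itertools.count(1)    # names are never reused
        self._objects = {}

    def create(self, payload):
        name = next(self._names)
        self._objects[name] = payload
        return Capability(name)

    def destroy(self, cap):
        self._objects.pop(cap.name, None)

    def access(self, cap):
        if cap.name not in self._objects:
            raise DanglingReference(f"object {cap.name} has been destroyed")
        return self._objects[cap.name]
```

Because a destroyed object's name is never handed out again, a stale capability can never silently refer to a newer object; it can only produce a detected error.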

Capabilities  and  objects  are  used  to 
create  a  high-level  storage  model,  the 
elimination  of  traditional  low-level 
storage concepts being another objective
of the architecture. Figure 1 depicts a
possible state of the storage model. The
architecture recognizes five types of
objects, four of which (module, process
machine, port, data-storage object) are
explicitly created and addressed by
programs and one of which (activation record)
is implicitly created via a module
cation. 

Full  generality  of  allowing  capa¬ 
bilities  to  reside  in  objects  is  provided; 
capabilities  are  protected  by  their  being 
one  of  the  15  tagged  cell  (data)  types. 

As shown, the architecture also uses
capabilities to reference source/sink
(storageless) I/O devices.


Figure  1 


Single  level  storage.  The  concept  of 
virtual  storage  has  been  generalized  to 
the  extent  that  there  is  no  notion,  above 
the  architecture,  of  secondary  storage. 

For  instance,  the  concept  of  files  no 
longer  exists;  programs  use  arrays  to 
represent  what  would  have  been  considered 
to  be  a  file.  Hence  the  concept  of 
secondary-storage  I/O  has  been  eliminated; 
all  data  in  the  system  are  addressed  in  a 
uniform  way,  and  all  other  concepts  in  the 
architecture  (e.g.,  tagged  storage)  apply 
to  all  data  in  a  uniform  manner. 

Within  the  environment,  all  concepts 
of  storage  allocation  have  been  removed 
from  the  domain  of  software.  Although 
storage  allocation  does  occur,  it  is  done 
implicitly  by  the  machine,  for  instance, 
as  an  effect  of  a  module  invocation 
(where the machine creates an activation
record  for  the  module's  local  variables). 
Rather  than  being  able  to  allocate  space, 
programs  are  presented  with  a  function  to 
allocate  occurrences  of  cell  types,  such 
as strings and arrays (the dynamic
allocation  of  which  is  embodied  in  a 
data-storage  object). 

Small protection domains. Each
subroutine, procedure, or program is
represented  by  a  module  object,  which 
contains  the  generated  instruction  stream 
and  a  definition  of  the  module's  address 
space  (a  set  of  tagged  cells).  This 
structure  is  shown  in  Figure  2.  Instruc¬ 
tions  in  a  module  can  address  items  only 
within  the  private  address  space,  although 
well-controlled  indirect  references  can  be 
made,  via  parameters  and  capabilities, 
outside  of  the  address  space.  Thus  the 
architecture  enforces  rules  of  program 
modularity,  limits  the  consequences  of 
errors,  and  protects  a  program,  including 
the  system  software,  from  itself. 

Automatic  subroutine  management.  The 
architecture removes the burden of
subroutine management from the shoulders of the
compilers  by  containing  instructions  that 
perform  all  that  is  implied  by  a  subroutine 
call  in  a  high-level  language.  For 
instance,  the  CALL  instruction  saves  the 
state  of  the  current  module,  creates  and 
initializes  an  activation  record  for  the 
called  module,  switches  address  spaces, 
and  begins  execution  of  the  called  module. 
The  attributes  of  arguments  and  parameters 
are  verified  for  consistency  during  each 
call. 

Figure  2  shows  that  a  module's  add¬ 
ress  space  is  partitioned  into  two  sec¬ 
tions  -  the  "static  storage  die"  and 
"automatic  storage  die."  Cells  in  the 
static  storage  die  reside  permanently 
within  the  module  object.  When  a  module 
entry  point  is  called,  the  machine  creates 
Figure 2

an  activation-record  object  containing  a 
copy  of  the  definition  of  the  cells  in  the 
automatic-storage-die part of the address
space.  When  an  instruction  refers  to  a 
cell  in  the  automatic  storage  die,  the 
machine  automatically  maps  this  reference 
to  the  corresponding  cell  in  the  current 
activation  record. 

Hierarchical fault-handling
mechanism. The architecture contains a
uniform,  process-oriented,  rather  than 
system-oriented,  mechanism  for  the  hand¬ 
ling  of  error  conditions,  called  faults. 
Any  module  can  contain  a  special  fault¬ 
handling  entry  point  and  specify  which 
types  of  faults  can  be  handled  there. 

When  a  fault  is  detected  in  a  module,  the 
machine  searches  back  through  the  activa¬ 
tion  history  of  the  process,  looking  for 
the  first  module  that  has  indicated  a 
desire  to  handle  that  type  of  fault.  When 
one  is  found,  the  machine  "calls"  that 
entry  point  (i.e.,  simulates  a  subprogram 
call),  passing  it  five  arguments  describ¬ 
ing  the  fault  and  the  state  of  the  program 
at  the  time  of  the  fault.  What  happens 
after  that  is  a  function  of  the  fault- 
handling  software  in  the  module.  However, 
the  architecture  provides  several  instruc¬ 
tions  to  terminate  a  fault  handler  and  an 
instruction  to  explicitly  raise  fault 
conditions. 
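
The search back through the activation history can be sketched as below. The data layout is an assumption for illustration: each activation is modelled as a dictionary naming its module and the fault types it handles, and the handler receives a single description value rather than the five arguments mentioned in the text, whose contents are not given here.

```python
def raise_fault(activation_history, fault_type, description):
    """Dispatch a fault to the most recent activation willing to handle it.

    activation_history is ordered oldest first, so the search runs in
    reverse, from the faulting module back toward the start of the process.
    Returns the handler's result, or None if no module handles the fault.
    """
    for activation in reversed(activation_history):
        handler = activation["handlers"].get(fault_type)
        if handler is not None:
            # Simulates the machine "calling" the fault-handling entry point.
            return handler(activation["module"], fault_type, description)
    return None


def report(module, fault_type, description):
    """Hypothetical handler used for the example below."""
    return f"{module} handled {fault_type}: {description}"


history = [
    {"module": "main", "handlers": {"undefined_value": report}},
    {"module": "parse", "handlers": {}},
    {"module": "scan", "handlers": {"bounds": report}},
]
```

With this history, a "bounds" fault raised in scan is handled by scan itself, while an "undefined_value" fault skips scan and parse and is handled by main.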

Process machines. One of the five


A  process-machine  object  has  the  character¬ 
istics  of  a  hardware  processor  and  thus 
creates  a  multiprocessor  environment;  how¬ 
ever,  the  mapping  of  process  machines  to 
hardware  processors  is  a  matter  of  hardware 
implementation, not architecture. (At one
extreme, a single hardware processor can
time-slice itself to act as all process
machines.)

By  creating  and  destroying  process 
machines,  programs  create  and  destroy 
processes.  In  keeping  with  the  design 
rules  followed  throughout  the  architecture, 
this  entity  defines  only  a  mechanism,  out 
of  which  programs  can  create  policies. 

Also,  it  is  orthogonal  with  other  concepts 
in the architecture (e.g., process machines
have  no  relationship  to  addressing). 

Send/receive mechanism. Two machine
instructions, SEND and RECEIVE, and an
abstract  object,  a  port,  are  provided  for 
interprocess  communication.  The  SEND 
instruction  is  defined  almost  identically 
to  the  CALL  instruction,  except  where  CALL 
transfers  control  and  a  set  of  arguments 
to  a  module  entry  point,  SEND  transfers  a 
set  of  argument  values  through  a  port. 

That  is,  it  transfers  data  but  not  control. 
As  with  the  subroutine  call  mechanism,  type 
checking  occurs  across  the  send/receive 
interface.  As  mentioned  earlier,  source/ 
sink  devices  are  represented  by  capabili¬ 
ties,  and  one  does  I/O  operations  on  these 
devices  by  use  of  SEND  and  RECEIVE. 

The mechanism is synchronous to the
extent  that  a  process  machine  executing  a 
SEND  instruction  halts  until  another 
process  machine  receives  the  transmitted 
values.  Thus  the  mechanism  is  similar  to 
the rendezvous concept in the Ada
language.

Generic  instructions.  The  concept  of 
tagged storage allows the architecture to
be  defined  with  a  small,  highly  regular, 
generic  instruction  set.  For  instance, 
there  is  only  a  single  instruction  for 
performing  addition  -  ADD  -  and  only  a 
single  instruction  for  transferring  values 
in  storage  -  MOVE.  The  semantics  of  the 
instructions  are  defined  by  the  attributes 
of  their  operands.  For  instance,  the  MOVE 
instruction  can  be  used  to  store  an 
integer  value  in  a  floating-point  data 
cell  (doing  an  automatic  data  conversion), 
store  one  character  string  in  another, 
store  a  scalar  value  into  all  elements  of 
an  array,  or  set  one  array  equal  to 
another.  One  of  the  benefits  of  this  is 
significant  simplification  of  compilers, 
particularly  the  code-generation  process. 
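
The way one generic instruction takes its meaning from operand attributes can be sketched for MOVE. This is an assumption-laden illustration: the Cell class, the tag names, the use of TypeError, and the rule that arrays must match in length are choices made for the example, and element-level conversion inside arrays is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    tag: str        # "integer", "float", "string" or "array" (assumed names)
    value: object


SCALAR_CONVERT = {"integer": int, "float": float, "string": str}


def move(src, dst):
    """Generic MOVE: behaviour is selected by the tags of both operands."""
    if dst.tag == "array":
        if src.tag == "array":
            if len(src.value) != len(dst.value):
                raise TypeError("array length mismatch")
            dst.value = list(src.value)                # array := array
        else:
            dst.value = [src.value] * len(dst.value)   # scalar into all elements
    elif src.tag == "array":
        raise TypeError("cannot move an array into a scalar cell")
    elif (src.tag == "string") != (dst.tag == "string"):
        raise TypeError("string and numeric cells are incompatible")
    else:
        # Scalar to scalar, converting to the destination's type.
        dst.value = SCALAR_CONVERT[dst.tag](src.value)
```

A compiler targeting such an instruction would emit the same MOVE for all four cases and leave the choice among them to the machine.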


instruction to address and move sub¬
strings  within  strings,  a  search  instruc¬ 
tion  to  search  an  array  for  a  matching 
value,  and  an  iterate  instruction  embody¬ 
ing  the  full  semantics  of  iterative  DO 
loops in such languages as Fortran and
PL/I.

For  process  synchronization,  the 
architecture  contains  two  instructions 
named  GUARD  and  UNGUARD.  They  can  be 
used  to  prevent  simultaneous  execution  of 
two or more processes through a critical
section  of  instructions  and  were  motivated 
by  the  software  design  and  synchronization 
concept  of  monitors. 12 

Transparent  indirect  addressing.  The 
concept  of  capabilities  has  been  expanded 
to  allow  capabilities  to  point  to  other 
capabilities  such  that,  if  a  program 
refers  to  a  capability,  the  machine  will 
interpret this as a reference to the last
capability  in  the  chain.  This  concept 
can  be  used  for  added  levels  of  data 
security,  by  an  operating  system  for 
access  control  of  objects,  and  to  allow 
one  to  dynamically  replace  objects  (e.g., 
modules)  in  a  program  while  the  program 
is  executing. 
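
Following a chain of capabilities to its last member can be sketched as below. The Capability class and the resolve function are hypothetical, and the cycle check is an added safeguard for the sketch rather than something the text describes.

```python
class Capability:
    def __init__(self, target):
        # target is either another Capability (indirect) or the object itself
        self.target = target


def resolve(cap):
    """Return the last capability in the chain starting at cap."""
    visited = set()
    while isinstance(cap.target, Capability):
        if id(cap) in visited:
            raise ValueError("cycle in capability chain")
        visited.add(id(cap))
        cap = cap.target
    return cap


# A program holds `handle`; `slot` is an intermediate link that can be
# rebound to replace the module while the program keeps the same handle.
version_1 = Capability("module v1")
version_2 = Capability("module v2")
slot = Capability(version_1)
handle = Capability(slot)
```

Here resolve(handle) yields version_1; after slot.target is set to version_2, the same handle resolves to the replacement module without the program's own capability changing.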

Program tracing facilities. Instructions
exist to activate the tracing of
branches taken, branches not taken, and/
or calls in specified modules. When such
events occur, they are treated by the
machine as faults and thus the fault-
handling mechanism mentioned above
applies.

Additional  security  features.  In 
addition  to  the  protection  concepts  of 
capabilities,  small  protection  domains, 
and  indirect  capabilities,  the  architec¬ 
ture  contains  additional  security  fea¬ 
tures,  such  as  the  ability  of  a  program 
to  restrict  the  copying  of  capabilities, 
an  instruction  to  assign  a  new  unique 
name  to  an  object,  and  a  second  level  of 
protection  provided  by  the  use  of  tagged 
storage. 

Semantic  checking.  One  of  the  major 
objectives  of  the  architecture  is  detec¬ 
tion  of  large  classes  of  semantic  errors 
in  programs,  errors  that  are  (1)  frequent, 
(2)  difficult  to  debug  when  they  occur  in 
conventional  systems,  (3)  common  to  many 
or  all  programming  languages,  and  (4)  in 
general,  not  detectable  at  the  time  of 
program  compilation.  Examples  of  a  few 
of  the  27  classes  detected  are  (a)  use  of 
undefined  data  values,  (b)  references  to 
nonexistent  array  elements,  (c)  the 
dangling-reference  problem,  (d)  data  type 
ambiguities (e.g., inconsistent declara¬
tions  of  global  data),  and  (e)  mismatch¬ 
ing  arguments  and  parameters.  Studies 
have indicated that these errors represent
30-50%  of  all  errors  in  typical  programs. 

Virtual  machine.  Although  not  an 
explicit  objective  of  the  architecture, 
attributes  of  the  architecture,  such  as 
capabilities  and  objects,  have  given  it 
the  characteristic  of  being  a  virtual- 
machine  environment,  meaning  that  programs 
can  exist  having  no  relationship  to  the 
operating  system,  and  multiple  operating- 
system  environments  can  coexist. 

Relevance  of  SWARD  to  the  Software  Problem 

The  SWARD  architecture  is  unique  in 
that  almost  every  aspect  of  the  architec¬ 
ture  was  motivated  by  a  desire  to  alle¬ 
viate  the  software  problem.  The  major 
ways  in  which  this  is  achieved  are  dis¬ 
cussed  below. 

The  extensive  semantic  checking  per¬ 
formed  by  the  machine  should  enhance  sig¬ 
nificantly  the  productivity  of  the  software 
testing  and  debugging  processes,  and  lessen 
the  consequences  of  errors  occurring  in 
production  programs. 

The  object  orientation  of  the  archi¬ 
tecture,  and  the  use  of  capability-based 
addressing,  presents  a  highly  uniform 
system  environment.  The  objects  of  the 
architecture  (modules,  process  machines, 
ports,  data-storage  objects),  as  well  as 
source/sink  I/O  devices,  are  addressed  in
an  identical  fashion.  This  has  important 
implications  on  the  complexity  of  system 
software  and  the  user  environment.  For 
instance,  where  conventional  systems  con¬ 
tain  a  variety  of  dissimilar  mechanisms 
for  the  binding  of  entities  (e.g.,  a 
"linkage  editor"  for  binding  program 
modules  together,  control-language  state¬ 
ments  and  "open"  services  for  binding 
programs  to  files),  an  operating  system 
can  be  defined  with  a  single  uniform 
concept  of  binding. 

The  single-level  store  concept, 
particularly  when  carried  forth  into 
programming  languages,  largely  eliminates 
the  need  for  I/O  concepts,  allowing  the
programmer  to  think  of  data  in  a  uniform 
way. 

The  use  of  the  SEND  and  RECEIVE 
instructions  as  the  basic  I/O  primitives 
for  source/sink  devices,  as  well  as  for 
interprocess  communication,  has  several 
benefits.  First,  it  adds  another  measure 
of  uniformity  to  the  system,  since,  for 
instance,  there  is  no  difference  among 
sending  a  character  string  to  a  printer, 
terminal,  or  another  process  through  a 
port.  Hence  there  is  only  one  concept  of 
data  transmission.  Second,  it  allows  one 
to  substitute  processes  for  I/O  devices, 
or  I/O  devices  for  processes,  without
modifying  one's  program.  Third,  the
send/receive  mechanism  is  synchronous  with
respect  to  whatever  is  on  the  other  side  (I/O
device  or  process).  Hence  there  is  only
one  concept  of  parallelism  in  the  system:
the  process.  There  is  no  concept  of  an
interrupt.

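A minimal sketch of this uniformity follows, under stated assumptions: the Port class and the consumer function are hypothetical, and Python threads stand in for both a device and a process. The sender issues the same send call in both cases and is held until the other side has taken the data, which models the synchronous behaviour described above.

```python
import queue
import threading


class Port:
    """A rendezvous point; the sender cannot tell what is on the other side."""

    def __init__(self):
        self._q = queue.Queue(maxsize=1)

    def send(self, data):
        self._q.put(data)
        self._q.join()          # block until the receiver has taken the data

    def receive(self):
        data = self._q.get()
        self._q.task_done()     # releases the blocked sender
        return data


def consumer(port, sink, count):
    """Stands in for either a printer driver or an ordinary process."""
    for _ in range(count):
        sink.append(port.receive())


def demo():
    printer_out, process_out = [], []
    printer_port, process_port = Port(), Port()
    threads = [
        threading.Thread(target=consumer, args=(printer_port, printer_out, 1)),
        threading.Thread(target=consumer, args=(process_port, process_out, 1)),
    ]
    for t in threads:
        t.start()
    for port in (printer_port, process_port):
        port.send("HELLO")      # identical call whatever the receiver is
    for t in threads:
        t.join()
    return printer_out, process_out
```

Replacing the "printer" thread with any other consumer requires no change to the sending code, which is the substitution property the text claims.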
Other  unifying  ideas,  all  of  which 
serve  to  make  the  programming  environment
a  less-complex  and  less-hostile  one,  are
the  fault-handling  mechanism  for  error
handling,  capability-based  addressing  for
information  sharing  and  protection,  the
highly  generic  instruction  set,  and  no
need  for  a  privileged  instruction  state. 

The  development  of  well-structured 
programs,  employing  concepts  of  modularity,
information  hiding,  and  parallel  processes,
is  encouraged  by  the  machine  concepts  of  an
efficient  subroutine-management  mechanism,
small  protection  domains,  the  fault-hand¬ 
ling  mechanism,  the  single-level  store,  the 
GUARD,  UNGUARD,  SEND,  and  RECEIVE  instruc¬ 
tions,  and  others. 

The  points  above  apply  to  the  pro¬
gramming  environment  in  general,  but
several  additional  points  can  be  made  about 
compilers,  operating  systems,  and  data-base 
management.  Because  of  the  concepts  of 
tagged  storage,  direct  recognition  of 
higher-order  data  types  such  as  arrays  and 
structures,  the  generic  instruction  set, 
and  the  power  of  the  instruction  reper¬
toire,  the  development  cost  and  complexity
of  compilers  should  be  significantly 
reduced. 

For  many  of  the  same  reasons,  and 
because  of  other  facilities  in  the  machine, 
the  overhead  and  development  cost  of  high-
level-language-oriented  testing  and
debugging  tools  should  be  greatly  reduced.

The  architecture  also  eliminates  much
of  the  traditional  complexity  of  operating
systems  and  other  subsystems  by  removing
from  them  the  problems  of  memory  manage¬
ment,  protection,  process  synchronization,
interprocess  communication,  and  interrupts.

The  use  of  generic  instructions  and 
tagged  storage  implies  the  latest-possible
binding  of  instructions  and  data;  the
semantics  of  an  instruction  are  determined
at  the  time  of  its  execution,  using  the
information  in  the  tags  of  its  operand
cells.  SWARD  extends  this  even  further  by
allowing  the  programmer  to  incompletely
specify  the  attributes  of  a  local  variable
in  its  tag;  this  allows  a  local  variable  to
acquire  dynamically  some  or  all  of  its
attributes  (e.g.,  from  a  parameter).  These
points  have  significance  to  the  concept  of
data  independence  in  data  base  environ¬
ments.
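
The late binding of instruction semantics to operand tags can be sketched as follows. This is an illustration only, not the SWARD instruction set: Cell, generic_add and bind_parameter are hypothetical names, and the tags "integer" and "real" are assumed for the example. A single ADD chooses its meaning from the tags at execution time, and a local variable with an unspecified tag acquires its attributes from the argument bound to it.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Cell:
    tag: Optional[str] = None   # None models an incompletely specified tag
    value: Any = None


def generic_add(a: Cell, b: Cell) -> Cell:
    """One ADD instruction; its meaning is chosen from the operand tags."""
    if a.tag != b.tag:
        raise TypeError(f"cannot ADD {a.tag} to {b.tag}")
    if a.tag == "integer":
        return Cell("integer", int(a.value) + int(b.value))
    if a.tag == "real":
        return Cell("real", float(a.value) + float(b.value))
    raise TypeError(f"ADD is not defined for tag {a.tag!r}")


def bind_parameter(local: Cell, argument: Cell) -> Cell:
    """A local with no tag acquires its attributes from the argument."""
    if local.tag is None:
        local.tag = argument.tag
    elif local.tag != argument.tag:
        raise TypeError(f"{argument.tag} passed where {local.tag} expected")
    local.value = argument.value
    return local
```

The same routine text can therefore serve integer and real callers without recompilation, which is the sense in which the approach supports data independence.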

Given  the  magnitude  of  the  software
problem  today  and  an  appreciation  for  how
much  worse  it  will  be  tomorrow,  and  given
the  rapid  advances  in  hardware  technology,
the  time  seems  ripe  for  major  architecture
redirections  that  make  fundamental  improve¬
ments  in  the  programming  environment.  The
SWARD  architecture  serves  as  an  example  of
how  a  machine  architecture  can  reduce
software  complexity  and  lessen  the
difficulty  and  error-proneness  of  program
design,  coding,  testing,  and  debugging.

ABSTRACT

Given  the  advances  of  technology,  it  is  not  unreasonable  to  project
the  existence  of  multiple  processor  configurations  that  have  large
numbers  of  processors  with  a  variety  of  interconnection
possibilities.

This  paper  discusses  language  constructs  for  interprocess
communication  and  process  creation  functions  which  would  be
fundamental  to  systems  that  run  sets  of  programs  dispersed  across
families  of  logical  processors.  Certain  divergences  between  various
concepts  of  interprocess  communication  are  resolved  in  a  single
design.

INTRODUCTION 

Recent  dramatic  developments  in  processor/memory  technology
and  in  interconnection  methodologies  for  the  association  of
processors  with  each  other  suggest  that  future  multiple  processor
configurations  may  have  large  numbers  of  fast  and  cheap
processors  with  a  variety  of  memory  sharing  possibilities  [1,2,3,8].

An  objective  of  such  multiple  processor  systems  will  be  to  react
quickly  and  dynamically  to  the  changing  demands  on  the  system.
This  implies  the  need  not  only  to  group  a  set  of  processors  to  work
on  a  given  set  of  applications  but  also  to  dynamically  partition
memory  spaces  which  are  physically  common  amongst  this  set  of
processors.  By  convention  we  will  refer  to  a  set  of  processors  and
memory  formed  dynamically  as  a  logical  system.

In  such  systems  sub-configurations  of  closely  cooperating  generic
multiprocessors  may  be  formed  and  partitioned  sets  of
'distributed'  configurations  may  be  formed  between  units  with  a
rich  diversity  of  decisions  about  memory  sharing,  code  replication,
etc.  The  intent  of  the  concept,  of  course,  is  to  allow  systems  to
take  shapes  appropriate  for  their  applications.  To  support  this
notion  various  speeds  of  memory  and  processors  would  be
available  so  that  various  concepts  of  application  speed  and
partitioning  can  be  supported  by  decisions  about  processor  and
memory  speed  and  capacity.

Some  important  concepts  of  dynamic  configurability  should  exist
in  the  system.  Sets  of  closely  cooperating,  memory-shared
processors  should  be  dynamically  definable  for  short  periods;
cooperating  or  independent  sub-configurations  of  "distributed"
systems  should  also  be  definable  for  brief  periods.  Where
desirable,  permanent  "gangs"  of  associated  processors  at  different
levels  of  memory  and  operating  system  sharing  should  also  be
definable  within  the  total  population  of  processors,  memories  and
other  resources  of  the  system.  A  goal  of  such  a  system  is  to  make
maximum  use  of  the  well  known  concept  that  logical  systems
structures  of  varying  kinds  of  relationships  and  degrees  of
cooperation  can  be  mapped  onto  physical  structures.


DESIGN  CONCEPTS 

A  very  well  known  way  of  structuring  an  operating  system  is  to
define  vertical  partitions  of  functions  such  that  there  is  a
functional  module  for  I/O,  memory  management,  process
communications,  process  synchronization,  etc.  The  great  advantage
of  the  structure,  of  course,  is  that  it  allows  multiple  parallel
services  to  be  achieved  in  multiple  processor  environments.

The  structure  can  be  supported  by  hardware  in  a  number  of  ways.
Each  functional  module  can  be  located  in  protected  address
spaces  in  a  large  single  physical  processor/memory.  An  interesting
attribute  of  a  capability,  object  management  architecture  such  as
SWARD  [?]  is  that  the  physical  configuration  of  memory  is
logically  irrelevant.  Configurations  can  be  formed  with  various
degrees  of  shared  or  private  physical  memory  without  impacting  the
logic  of  the  object  management  system.

The  ability  to  assign  some  number  of  processors  of  any
architecture  to  a  system  suggests  that  those  processors  may  be
used  as  Global  Service  processors,  each  assigned  to  a  significant
operating  system  function  of  the  type  suggested  above.  There  may
be  a  Systems  Wide  Message  Handler,  a  Systems  Wide  Global
Scheduler,  a  Systems  Wide  I/O  Server,  etc.

In  an  alternative  structure,  each  processing  node  could  be
composed  of  two  processing  elements.  Conceptually  one  might
think  of  a  Problem  State  element  and  a  Supervisor  State  element.
All  those  activities  which  would  be  executed  in  supervisor  state  in
S/370  architecture  would  be  executed  in  one  element,  while  all
those  in  problem  state  would  be  executed  in  another  element.
Although  this  serves  as  a  conceptual  example,  it  is  not  clear  that
this  particular  partitioning  of  function  between  elements  of  a  node
is  the  proper  partitioning  point.  The  discovery  of  a  proper
partitioning  between  computational  element  and  operating  system
element  depends  upon  a  number  of  factors  which  include
frequency  of  function,  instruction  set  restrictions,  the  degree  of
asynchroneity,  etc.  A  fall  back  concept  is  to  view  the  operating
system  element  as  a  kind  of  network  processor  which  becomes
involved  only  when  the  associated  computational  processor  issues
a  request  which  will  involve  interaction  with  another  station  in  the
network.  This  may  be  falling  back  too  far  since  it  places  in  the
computational  processor  the  burden  of  determining  when  an
off-station  reference  must  be  made  and  this  effort  may  be  large
compared  to  making  the  interaction  itself.  It  is  preferable  for  the
operating  system  processor  to  determine  what  and  when
off-station  references  must  be  made  while  the  computational
processor  proceeds  with  other  available  work.

In  such  a  system,  when  each  node  is  comprised  of  at  least  an
operating  system  processor  and  a  computational  processor,  each
operating  system  element  has  a  functionally  equivalent  local
operating  system  that  participates  in  global  system  decisions  and
global  system  services  as  well  as  providing  local  support.  This
constitutes  the  basis  for  a  completely  distributed  control  system  in
which  intensive  interaction  is  sustained  between  stations  and
where  negotiation  and  co-operation  lead  to  system  wide  decisions
about  work  distribution.  A  host  processor  for  a  unit  of  work  may
be  discovered  by  interaction  and  negotiation  with  operating
system  elements  aware  of  what  their  associated  computational
processor  is  doing.  This  negotiation  goes  on  without  disturbing  the
progress  of  available  work  on  a  local  dispatch  list.  The  following
sections  address  these  design  objectives  with  respect  to  the
language  constructs  and  control  structures  for  process  creation  and
inter-process  communication.

TERMINOLOGY 

In  the  system  we  are  about  to  describe  we  introduce  the  following
terminology:

PROCESS  CONTROL  TERMINOLOGY 

APPLICATION  -  An  application  defines  a  context  by
indicating  a  set  of  procedure  and  data  objects  that  may
be  accessed  and  states  the  rules  of  reference.  The
application  describes  both  the  physical  and  abstract  resource
constraints  necessary  and  permissible  for  processes
belonging  to  the  application.

PROCEDURE  -  A  procedure  is  program  text  and  capabilities
for  an  incarnation  of  a  process  or  an  instance  of
activation.  Its  unique  feature  in  this  system  is  a  statement
of  consumable  resource  constraints  in  addition  to  the  list
of  abstract  resources  (such  as  files,  data  bases,  locks,  etc.)
necessary  for  successful  processing.

PROCESS  -  The  system  dispatchable  unit.  A  stack  of
activation  records,  each  associated  with  a  procedure,
resting  upon  a  process  activation  block  that  may  be  used
for  recovery.  A  process  is  named.

USER  -  Each  user  of  the  system  is,  of  course,  defined  to
the  system.  Part  of  this  definition  is  a  list  of  the  total  set
of  applications  which  the  user  can  connect  to.

PROCESS  COMMUNICATION  TERMINOLOGY 


PORT  -  A  port  may  be  a  to_port  or  a  from_port.  On  a
sender's  side  a  from_port  is  a  named  place  in  a  sender's
program  (e.g.  a  declared  structure  in  the  PL/I  sense)
serving  as  the  source  of  a  message  to  be  sent.  A  to_port
is  the  name  of  a  receiving  process.  On  a  receiver's  side  a
to_port  is  a  named  place  in  the  receiver's  program  where
the  message  will  be  placed.  A  from_port  is  the  name  of
a  sending  process.

PATH  -  A  path  can  be  either  a  queue  name  or  a  file
name.  The  path  represents  an  indirection  from  the  sender
to  either  a  specific  receiver  or  to  an  arbitrary  receiver.

The  exact  details  of  inter-process  communication  will  be  deferred
until  the  section  on  process  communication.


PROCESS  CREATION 

One  of  the  major  objectives  in  introducing  new  language
constructs  is  to  ensure  that,  in  so  far  as  is  reasonable,  the
language  for  application  programming  is  the  same  as  the  user's
command  language.  Not  having  this  as  an  objective  results  in
increased  complexity  by  requiring  a  user  to  learn  more  than  one
language  for  performing  the  same  function.  For  simplicity,
PASCAL  is  used  as  the  language  for  syntax  expression  in  this
section  and  in  the  next  section  [7,9],  though  the  language
constructs  presented  are  not  unique  to  PASCAL.

In  the  system  we  are  presenting,  there  are  users  who  initiate
processes  (which  can  initiate  still  more)  which  run  under  a  given
application  scope.  Consequently,  we  have  the  following  declarative
structures:

type  user  =  record
    application_set:  set  of  application;
    default_appl:  application;
end;

type  application  =  record
    name_space:  set  of  name_pair;
    default_process:  procedure;
    processor_resource:  proc_req;
    abstract_resource:  set  of  resource;
end;

type  name_pair  =  record
    name:  alfa;
    object:  object_descriptor;
end;

type  proc_req  =  record
    min_processors:  integer;
    max_processors:  integer;
    min_memory:  integer;
    max_memory:  integer;
    instruction_set:  machine_type;
    performance:  set  of  perform_req;
end;

type  procedure  =  record
    entry_point:  program;
    name_space:  set  of  name_pair;
    processor_resource:  proc_req;
    abstract_resource:  set  of  resource;
end;


The  fields  of  the  above  records  are  described  as  follows:


USER  -  The  application_set  is  a  list  of  application
names.  These  represent  the  total  set  of  applications
that  a  given  user  is  allowed  to  access.  The
default_appl  is  the  application  that  a  user  will  be
automatically  connected  to  when  he  LOGONs  to  the  system.
This  field  is  optional.  Creation  of  an  object  of  type  user,
assuming  the  creator  has  the  'right'  to  create  such  an
object  (e.g.  var  hal,  batty:  user),  results  in  the  system
creation  of  a  user  object.  When  a  user  issues  a  LOGON
to  the  system  (e.g.  LOGON  hal),  the  system  searches  for
the  user  object  named  hal.  If  not  found  then  the  LOGON
is  rejected;  otherwise  the  user  object  is  searched  for
a  default_appl  name  (e.g.  hal.default_appl  =  null?).  If
specified,  then  the  user  will  be  connected  to  an  instance
of  that  application  (one  will  be  created  if  it  does  not
already  exist).

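The LOGON behaviour just described can be sketched in a few lines. This is an illustration under assumptions, not the authors' implementation: the User and System classes are hypothetical, applications are represented only by name, and "payroll" in the usage note is an invented application name.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class User:
    application_set: Set[str] = field(default_factory=set)
    default_appl: Optional[str] = None       # the optional field


class System:
    def __init__(self):
        self.users: Dict[str, User] = {}     # user objects, by name
        self.instances: Set[str] = set()     # existing application instances

    def logon(self, name: str) -> Optional[str]:
        """Return the application the user is connected to, if any."""
        user = self.users.get(name)
        if user is None:
            raise LookupError(f"LOGON rejected: no user object named {name!r}")
        if user.default_appl is None:
            return None                      # logged on, not yet connected
        self.instances.add(user.default_appl)  # created only if not present
        return user.default_appl
```

For example, after `s.users["hal"] = User({"payroll"}, "payroll")`, the call `s.logon("hal")` connects hal to the payroll instance, creating it on first use, while a LOGON for an unknown name is rejected.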
APPLICATION  -  An  application  defines  the  universe  of
accessibility  for  all  processes  and  users  connected  to  it.
The  name_space  is  therefore  a  set  of  name_pairs
(representing  the  objects  that  can  be  accessed).  The  first
element  in  the  pair  is  the  name  of  the  object,  and  the
second  is  the  descriptor  of  the  object  mapped  to  by  the
name.  Included  in  the  descriptor  are  the  rights  of  access
(such  as  the  primitives  Read,  Write,  Execute),  and  the
type  of  the  object  (such  as  queue,  procedure,  file,  nested

The  default_process  is  the  name  of  the  procedure  to  be
invoked  (as  a  process)  when  an  instance  of  the  application
is  created.  This  field  is  optional  and  allows  an
application  to  implicitly  initialize  its  relevant  structures.  Of
course,  resolution  of  the  name  is  through  the  defined
application.name_space.

The  processor_resource  represents  the  processor
requirements  of  the  application  and  is  self  explanatory.  Upon
completion  of  the  creation  of  an  instance  of  an  application,
processor  resources  are  allocated  to  the  application.
The  allocated  set  is  referred  to  as  a  gang.  All  created
processes  that  belong  to  an  application  run  in  the  gang
associated  with  that  application.

The  procedure  represents  a  model  for  either  the  creation
of  a  process  (either  implicitly  or  through  usage  of  the
START  command  to  be  discussed  below)  or  the  creation
of  an  activation  within  a  process  (through  the  standard
program  call  interface).  The  processor  requirement
specification  in  the  procedure  allows  for  the  tailoring  of  a
given  process  to  the  consumable  resource  requirements  of
the  program.  Specifically,  performance  objectives  such  as
priority,  degree  of  I/O  boundedness,  deadlines,  etc.  can
be  stated  in  the  performance  requirement  of  the
procedure.  The  name_space  defines  the  scope  of  accessibility
of  the  activation  spawned  from  the  procedure.  If  a
name_space  is  not  present  in  the  definition  then  the
caller's  name_space  is  assumed.  It  should  be  noted  that
there  need  not  be  an  intersection  between
application.name_space  and  a  name_space  defined  in  a
procedure  whose  name  is  in  application.name_space.

Procedure.abstract_resource  identifies  which  resources,
such  as  data  bases,  files,  locks,  etc.,  have  to  be  allocated
before  process/activation  creation  can  occur.

Given  this  basis,  we  can  now  address  process  and  application
creation.  Previously,  we  have  shown  how  an  application  (and  its
default  process)  can  be  created  as  a  result  of  a  user  performing  a
LOGON.  This  in  itself  is  not  novel  and  is  typical  of  many
interactive  systems.

We  will  now  discuss  explicit  process  and  application  instance
creation.  The  command  START  is  used  for  both  process  and
application  instance  creation,  and  has  the  following  format:

START  variable,  name

This  command  is  identical  to  the  form  that  would  be  used  within  a
process  to  create  another  process  or  application.  The  variable  is
the  name  of  the  entity  being  started.  If  the  issuer  is  not  already
constrained  within  an  application  instance  then  START  searches
the  user  block  to  determine  if  the  variable  is  a  valid  application
name.  If  it  is  not  then  the  request  is  rejected.  Otherwise,  an
application  instance  is  created  and  resources  allocated  (note:  the
consumable  resources  allocated  are  transparent  to  the  caller  and
are  only  known  by  the  system).  If  the  application  has  a
default_process  defined  then  that  process  is  implicitly  created.

If  the  issuer  is  already  constrained  to  a  given  application  instance
(either  through  a  LOGON  or  START)  then  the  variable  is  initially
treated  as  a  procedure  name.  In  this  case,  a  search  is  made  from
the  application.name_space  (for  a  first  class  process  creation)  or
from  the  name_space  associated  with  the  issuing  process.  If  a
procedure  is  not  found  from  the  search,  then  a  search  is  made
from  the  user  block  treating  the  variable  as  an  application  name.
If  the  procedure  is  found  then  a  process  is  created.  The  caller  is
not  aware  of  where  the  created  process  is  running.

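The resolution order that START follows can be sketched as below. This is a simplified illustration, not the authors' design: Session, AppInstance and start are hypothetical names, resource allocation and the default_process are omitted, and only the application name_space is searched (the per-process name_space is not modelled). It also rejects a caller-specified name that already denotes a process in the instance.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class AppInstance:
    name_space: Set[str]                       # procedure names reachable here
    processes: Dict[str, str] = field(default_factory=dict)  # name -> procedure


@dataclass
class Session:
    user_apps: Dict[str, Set[str]]             # user block: application -> name_space
    current: Optional[AppInstance] = None      # None until constrained to an instance


def start(session: Session, variable: str, name: str) -> str:
    """Resolve `variable` as a procedure first, then as an application."""
    app = session.current
    if app is not None and variable in app.name_space:
        if name in app.processes:
            raise ValueError(f"name {name!r} already denotes a process")
        app.processes[name] = variable         # create the process
        return "process"
    if variable in session.user_apps:          # fall back to the user block
        instance = AppInstance(set(session.user_apps[variable]))
        if app is None:
            session.current = instance         # issuer becomes constrained
        return "application"
    raise LookupError(f"START rejected: {variable!r} not found")
```

An unconstrained issuer can thus only start applications, while a constrained issuer gets a process whenever the variable names a reachable procedure.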
The  name  specified  on  START  is  the  caller  known  name  of  the
created  entity.  If  a  process  was  created  then  the  application.name
represents  the  unique  name  of  the  created  process.  The  START
request  will  fail  if  the  caller  specified  name  is  already  associated
with  another  process  in  the  application  instance.  The  usage  of  this
name  will  become  apparent  in  the  discussion  on  inter-process
communication.


What  has  been  shown  is  a  very  simple  way  to  effect  process
creation.  Applications  and  processes  can  be  created  and  as  a
result  logical  systems  can  be  formed  dynamically  and  without
explicit  installation  intervention.  The  decision  over  whether  the
logical  systems  are  distributed  or  tightly  coupled  becomes  purely
one  of  application  and  procedure  definition  which,  of  course,  can
also  be  dynamically  modified.

Dynamic  changes  to  resource  consumption  rights  and  processor
scheduling  constraints  associated  with  any  creation  of  a  procedure
may  be  made  by  simple  use  of  the  declarative  structures  of  the
language.

The  scheduling  constraints  which  may  be  associated  with  START
suggest  that  rather  complex  global  systems  management  of  the
type  associated  with  large  scale  multiprocessors  may  be  a  feature  of
a  multiple  processor  aggregative  system.  These  scheduling  rules
may  be  enforced  by  a  global  systems  scheduler  node  or  by
co-operative  interaction  between  a  set  of  operating  system
processors  which  are  associated  with  the  computational  processors  of
the  system  on  a  one-to-one  or  one-to-many  basis.

INTER-PROCESS  COMMUNICATION


Given  processes,  the  next  step  is  to  provide  them  with  a  mechanism
for  effecting  inter-process  communication.  For  this  function  we
will  postulate  the  existence  of  an  Inter-Process  Communicator
(IPC).  The  IPC  can  exist  either  as  a  central  service  processor  or  can
exist  as  a  distributed  service  in  each  of  the  logical  systems
defined.  Its  physical  existence  is  largely  irrelevant.  What  is
important  is  that  the  services  it  provides  remain  constant  no  matter
where  the  IPC  physically  resides.  Thus  no  application  recoding
would  be  required  if,  for  example,  the  decision  to  have  a  global
IPC  proved  to  be  wrong.

We  will  postulate  a  Send/Receive  mechanism  with  five  verbs:
CONNECT,  SEND,  RECEIVE,  SIGNAL,  and  DISCONNECT.

These  verbs  allow  processes  to  communicate
with  each  other  directly,  or  through  named  objects,  in  a
synchronized  or  asynchronous  manner.  The  ability  to  send  and  receive
between  processes  and  objects  permits  I/O  to  be  subsumed  into
the  communication  mechanism.  CONNECT  establishes  a  path
between  an  issuing  process  and  any  named  object  of  the  system.
Then  a  process  may  subsequently  SEND  messages  to  another
process  or  a  named  data  object.  Messages  sent  to  other  processes
may  be  passed  through  queues  or  sent  directly  to  ports  of  a
receiving  process.  A  receiver  may  wait  for  messages  from  a  data
object,  a  queue,  or  a  sender  process.  One  to  many  relationships
may  be  defined  to  support  message  broadcasting,  handling  of  a
message  from  any  of  a  set  of  possible  servers,  etc.

An  important  potential  feature  of  Inter-Process  Communication
(IPC)  running  on  an  architecture  with  self  describing  data  [?]  is
the  use  of  the  CONNECT  verb  to  describe  a  message  template
which  provides  a  description  of  the  message  formats  which
are  to  be  moved.  A  common  problem  in  IPC  mechanisms  is
uncertainty  about  whether  a  sender  or  receiver  is  at  fault  when
message  formats  do  not  match.  This  may  occur  because  of
programming  errors  which  name  a  wrong  port  or  queue  for
transmission  or  receipt  of  a  particular  message.  The  provision
of  a  message  template  gives  the  IPC  a  means  of  determining
whether  wrong  messages  or  badly  formed  messages  are  the
responsibility  of  the  sender  or  receiver.  In  systems  where  data  is
self  describing,  an  IPC  can  check  the  tags  of  a  message  for
conformity  with  the  message  template.  Each  receiver  supplies  a
template  of  the  expected  message  format  which  is  also  checked
against  the  message  template  provided  by  the  CONNECT,  which
has  the  following  format:


CONNECT  connect_point

where:

type  connect_point  =  record
    ports:  set  of  message_areas;
    path:  set  of  (queue,  file);
    message_template:  message_format;
    case  (send,  receive)  of
        send:  (to_port:  process_name);
        receive:  (receive_point:  procedure;
                  from_port:  process_name);
end;

If  CONNECT  is  used  the  parameters  may  come:

1.  totally  from  the  sender

2.  totally  from  the  receiver

3.  in  some  combination  of  both

The  CONNECT  verb  can  be  used  by  both  Senders  and  Receivers
(hence  the  usage  of  case  in  defining  the  connect_point  type).

Ports  specifies  the  location  of  the  message  areas  in  the  issuing
process  to  be  used  either  as  the  location  for  receipt  of  messages
or  for  the  submission  of  messages.

Path  is  optional  and  specifies  an  indirection  point  in  the
transmission  of  the  message.  The  object  of  the  path  is  physically
owned  and  managed  by  the  IPC.  Usage  of  a  path  in  the  transmission
of  a  message  guarantees  the  recovery  of  that  message.  Queues  are
transient  and  exist  as  long  as  the  process  which  requested  their
creation  (this  process  can  be  different  from  either  sender  or
receiver  and  could  represent  a  caretaker  process).  Files  are
permanent  and  have  to  be  explicitly  destroyed.  Submission  of  a
message  through  a  path  guarantees,  in  general,  the  persistence  of
that  message  even  though  the  sender  and  potential  receiver  go
through  untimely  termination.  Both  FIFO  and  LIFO  queueing
techniques  are  applicable  with  queues  and  files  and  are  specified
when  the  object  is  created.

The  message_template  specifies  the  sender's/receiver's  model  for
the  message.  It  describes,  for  example,  the  length  of  the  message,
what  the  encoding  of  the  message  is  (ASCII,  Fixed  Decimal,
Floated,  etc.),  and  its  format  (for  a  multi-segmented  message).
This  template  is  used  by  CONNECT  for  comparison  with  a
template  associated  with  the  path.  A  sender's  and  receiver's  template
is  compared  with  the  path  template.  If  there  is  a  disagreement
between  the  path  template  and  that  of  sender  or  receiver,  the
process  with  the  divergent  template  is  notified  of  a  message  type
error.  If  there  is  no  path,  the  sender's  and  receiver's  templates  are
compared  with  each  other.  In  case  of  an  error  both  processes  are
informed  of  a  mismatch.  This  feature  is  most  practical  for
hardware  systems  that  have  strong  features  of  self-describing  data  and
tagged  memory.

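The fault-attribution rule described above can be stated compactly in code. The following is a minimal sketch under assumptions: Template and check_templates are hypothetical names, and a template is reduced to a length and an encoding for illustration. With a path template, only the divergent party is notified; without one, both parties are.

```python
from typing import List, NamedTuple, Optional


class Template(NamedTuple):
    length: int       # message length
    encoding: str     # e.g. "ASCII"


def check_templates(sender: Template,
                    receiver: Template,
                    path: Optional[Template] = None) -> List[str]:
    """Return the parties to notify of a message type error (empty if none)."""
    if path is not None:
        # The path template is the reference; each side is judged against it.
        return [who for who, t in (("sender", sender), ("receiver", receiver))
                if t != path]
    # No path: the two templates can only be compared with each other,
    # so a mismatch cannot be attributed and both sides are informed.
    return [] if sender == receiver else ["sender", "receiver"]
```

For instance, with a path template of 80 ASCII characters, a sender declaring 132 characters is the only party notified, while the same disagreement without a path is reported to both.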
If  the  case  is  for  SEND  then  to_port  refers  to  the  process  name
of  the  process  that  is  to  receive  the  request.

If  the  case  is  for  RECEIVE  then  the  receive_point  refers  to  a
procedure  that  is  to  be  invoked  when  another  process  issues  a
SEND  (not  through  a  path)  to  that  process.  The  receive_point
represents  a  point  of  interruption  for  asynchronous  receipt  of
messages.


It  is  possible  to  support  IPC  without  a  CONNECT.  If  no
CONNECT  is  issued,  processes  may  communicate  directly  or  indirectly
using  the  usual  capability  control  mechanisms  of  the  operating
system  which  provide  for  acquiring  names  of  processes  and  path
objects.  In  this  usage,  full  specification  must  occur  with  SENDs
and  RECEIVEs.  The  penalty  for  such  use  is  increased  risk  of  run
time  failure.


In  case  (1)  or  (2)  the  parameters  associated  with  the  CONNECT
are  imposed  by  the  system  upon  the  relationship.  Case  (3)  raises
interesting  considerations  that  have  not  yet  been  fully  explored,  as
to  the  degrees  of  freedom  between  SEND  and  RECEIVE
parameters.  For  example,  a  CONNECT  issued  by  a  receiver  that  names  a
path  could  be  considered  inconsistent  with  a  CONNECT  issued  by
a  sender  which  did  not  name  a  path.  However  we  may  convince
ourselves  that  there  is  some  advantage  in  having  a  path  transparent
to  one  side  of  the  send/receive  relation.

The  operation  of  sending  a  message  can  now  be  described:

SEND  token,  from_port,  path,  to_port

SEND  has  four  operands.  The  from_port  specifies  which  message
areas  in  the  sending  process  contain  the  transmission  message.
The  path  specifies  an  indirection  path  for  the  message  (as
described  above)  and  the  last  operand,  a  to_port,  identifies  the
process  that  will  receive  the  message.  SEND  automatically  blocks
the  issuer  until  either  the  message  has  been  placed  on  a  path  (if
specified)  or  the  receiving  process  (if  no  path  has  been  specified)
has  received  the  message.

Specification of the three operands (from_port, path, to_port) is
optional and can be derived from the preceding CONNECT.
Their inclusion on SEND is to allow an area to be used to send
messages to more than one connect point.

In fact, if to_port is not specified in either CONNECT or SEND,
then path must be specified in either. In this case, the message
will be placed on the queue or file by the IPC and the sender will
be SIGNALed to remove it from the blocked state. Such messages
can be removed by any process which has IPC access to the path.

If path is not specified in either CONNECT or SEND, then a
to_port must be specified. In such a case, the message is sent
directly to the receiving process (if it has an outstanding
CONNECT or RECEIVE). If there is no outstanding RECEIVE, then
the receive_point identifies the procedure to be invoked and an
activation is immediately created and made the current one. The
deblocking of the sender is then the responsibility of the
receive_point activation, which should issue a SIGNAL to indicate
receipt of the message. If a RECEIVE has not been issued and
the CONNECT does not define a receive_point, then the IPC will
implicitly queue the message, leaving the sender blocked, until a
RECEIVE is issued. It is still the receiver's responsibility to
deblock the sender. The system events that occur when there is an
outstanding RECEIVE are discussed below when we describe
RECEIVE.
The token is the unique identifier of the message and is assigned
by the IPC. It is this token that is used by SIGNAL to indirectly
deblock the sender. It is also used by the sender to later
determine the status of a submitted message (e.g. still on a path,
received, etc.). Similarly, if a RECEIVE is issued without a
receive_point specification in the connect_point, then the receiver
is blocked until a message arrives for it.
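
The operand rules for SEND given above can be summarized in a short
sketch. The Python below is an illustration only and is not part of the
proposed language: the dictionary representation of the CONNECT and SEND
operands and the names resolve_send and direct_delivery_action are
assumptions made for this example.

```python
OPERANDS = ("from_port", "path", "to_port")


def resolve_send(connect, send):
    """Fill in operands omitted on SEND from the preceding CONNECT.

    Both arguments are dicts (a hypothetical representation); a missing
    key or a None value means "not specified".
    """
    merged = {}
    for name in OPERANDS:
        value = send.get(name)
        merged[name] = value if value is not None else connect.get(name)
    if merged["path"] is None and merged["to_port"] is None:
        # Neither a path nor a to_port was given anywhere.
        raise ValueError("either path or to_port must be specified")
    # A message with a path is placed on the queue or file; otherwise it
    # goes directly to the receiving process.
    mode = "via_path" if merged["path"] is not None else "direct"
    return mode, merged


def direct_delivery_action(outstanding_receive, receive_point_defined):
    """Outcome of a direct SEND, following the three cases in the text."""
    if outstanding_receive:
        return "deliver to the waiting receiver"
    if receive_point_defined:
        return "create an activation of the receive_point procedure"
    return "queue implicitly; sender stays blocked until a RECEIVE is issued"
```

In every direct case the sender remains blocked until the receiving side
issues a SIGNAL with the message token.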

SIGNAL  is  then  simply  of  the  form: 

SIGNAL  token 

RECEIVE is similar in form to SEND:

RECEIVE token, to_port, path, from_port

where (to_port, path, from_port) refer to the message area to
receive the message, the path (or indirection for the message), and
the sending process name (optional). Token is the unique
identifier of the transmitted message, returned upon successful
completion of this operation.

If path is not specified in either RECEIVE or CONNECT then the
from_port must be specified. In such a case, the receiver is
asking for a message from a specific process and will either wait or
continue asynchronously (in the event that a receive_point is
specified in the connect point). On issuing a RECEIVE, a
receiving process will get a message if a message is waiting in the IPC
mechanism. If there is no message and there is no named
receive_point procedure associated with the CONNECT, the
process will be blocked. If there is a named receive_point, the process
will be permitted to proceed asynchronously.

Similarly, if the from_port is not specified on RECEIVE or
CONNECT, then the path must be specified. Path identifies a queue or
file from which the receiver is willing to receive messages from any
process using this path. The receiver will be able to receive messages
sent to either this path or to the pair (path, to_port = receiving
process name). The receiver cannot receive messages sent to the
path and directed to another process, as a path can contain
messages directed to more than one process from more than one
process.

If a from_port and path are specified, then the receiver can
receive messages sent to the path from only the specified process.
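
The rules for which messages a RECEIVE may obtain can be written down as
a single predicate. This is a sketch under stated assumptions: the
Message record, its field names, and the function can_receive are
hypothetical and serve only to restate the three cases above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    sender: str                     # name of the sending process
    path: Optional[str] = None      # None: sent directly, not through a path
    to_port: Optional[str] = None   # None: for any process using the path


def can_receive(msg, receiver, path=None, from_port=None):
    """True if `receiver` may obtain `msg` with the given RECEIVE operands."""
    if path is None and from_port is None:
        raise ValueError("either path or from_port must be specified")
    if path is None:
        # No path: the receiver asks for a message from one specific process.
        return (msg.path is None and msg.to_port == receiver
                and msg.sender == from_port)
    if msg.path != path:
        return False
    if msg.to_port is not None and msg.to_port != receiver:
        # On the path, but directed to another process.
        return False
    # With both path and from_port, only that sender's messages qualify.
    return from_port is None or msg.sender == from_port
```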

SUMMARY 

What has been shown in the previous two sections is a simple set
of primitives for process creation and inter-process
communication.

The  primitives  are  configuration  independent  and  do  not  inhibit 
the  installation  from  determining  the  appropriate  logical  systems 
structures. 

There are many models of inter-process communications protocols
which differ in the relation of SEND/RECEIVE to process
blocking and concepts of WAIT, etc. They also differ in whether
intervening mechanisms are visible to communicating processes,
whether message collections survive process destruction, whether
messages may be queued or forced upon receivers, and in
conventions for the concept of reply and response.


This paper has described a design by which simple, direct,
synchronous, transient interprocess communication may be undertaken
without recoverability and integrity. As part of the same concept,
an intervening file or queue may be imposed which allows
many-to-many, one-to-many, and one-to-any interactions across protected
paths. The concept of SIGNAL is a concept of response. Replies
are seen to be undertaken through the issuance of SENDs at the
convenience of a receiver when he wishes to respond in a
meaningful way to a previous message.

The notion of START presented by this paper intends to provide
a mechanism by which processes can initiate other processes and
call for execution on nodes of the system that have various
performance, status, load and scheduling attributes.

A paper under preparation discusses various aspects of the
structure of an operating system that would support the language
constructs discussed here.
Abstract

A reduction language is a functional
programming language whose semantics is defined by
a set of rewrite rules.

Our paper describes the architecture of a
machine which directly executes reduction
language programs.

A laboratory model of this Reduction Machine
has been built at the GMD Bonn and is currently
used for experimental program design based on
Berkling's version of a Reduction Language.


Reduction  language  machines  constitute  a  novel 
approach  to  computing  that  is  radically  different 
from  the  conventional  von  Neumann  concept. 

As  the  main  feature  of  reduction  languages  is 
their  strictly  functional  style  of  program  design, 
the  architecture  of  a  Reduction  Language  Machine 
cannot  be  understood  without  having  a  basic 
knowledge  of  the  language  constructs  and  their 
execution. 

There already exist a number of papers dealing
with this subject, most notably those by
J. Backus [BACKUS 72 & 78] and by
K.J. Berkling [BERKLING 76], who originated the
research in this field, and by F. Hommes
[HOMMES 77 & 79], who implemented the first
simulation model of a Reduction Machine.

However,  it  is  thought  helpful  for  the  reader  of 
this  paper  to  be  briefed  on  the  Reduction  Language 
with  particular  emphasis  on  the  aspects  that  are 
relevant  to  an  appropriate  machine  organisation. 

The  paper  outlines  a  few  basic  Reduction 
Language  constructs,  their  rules  of  execution,  and 
the  machine  features  that  adequately  support  the 
processing  of  Reduction  Language  expressions. 

Then we give an overview of the machine
organisation and its operating principles, and a
functional description of a hardware model of the
Reduction Machine which has been constructed at the
GMD [KLUGE 79].


As the Reduction Language is supposed to permit a
strictly functional method of program design, its
most fundamental construct is of the form

apply  function  to  argument 

The  components  of  this  expression  map  onto  a 
binary  tree  with  'function'  and  'argument' 
appearing  in  the  left  and  right  subtree, 
respectively,  and  with  the  'apply  to'  as  root  node; 

        apply to
        /      \
 function      argument        Fig. 1

In general, 'function' and 'argument' are
non-trivial tree-structured expressions. The
'apply to' is a constructor which relates two
subexpressions in some meaningful way to each
other.

More formally, an expression e of the Reduction
Language is defined as e := con e1 e2, which is the
preorder notation of the tree

        con
        /  \
      e1    e2        Fig. 2

that links, by means of the constructor 'con', two
subexpressions 'e1' and 'e2' to each other to form
'e'.

The simplest expressions are atoms, such as
primitive function symbols, letter strings of any
finite length representing variables, or strings of
decimal digits which form decimal numbers.

Using  this  basic  structure  of  Reduction  Language 
expressions,  a  language  designer  would  have  to 
establish  a  set  of  primitive  functions,  data  types 
and  constructors,  which  must  be  complete  in  the 
sense  that  every  computational  problem  can  be 
formulated  by  a  systematic  application  of  these 
primitives. 

In  this  paper,  we  do  not  discuss  the  development 
of  such  a  complete  language  but  introduce  only  a 
particular  tree-processing  primitive  of  a  special 
Reduction Language [HOMMES 79] to show the basic
operating principle of the machine: let '>' be a
constructor which builds binary trees, i.e. '> A B'
is the tree

         >
        / \
       A   B        Fig. 3

with 'A' as left and 'B' as right subexpression.

Let 'head' be a primitive function which selects
the left subtree of such a binary tree, i.e.

apply head to > A B

results in 'A'; this transformation of an
expression to another expression of the same
meaning is called reduction.
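
As an illustration of the preorder notation and of this single reduction
rule, the following Python sketch parses a token sequence into a tree and
applies 'head'. It is not the machine's method (the machine works on
stacks, as described in the next section); the tuple representation and
the names parse and reduce_head are assumptions made for the example.

```python
CONSTRUCTORS = {"ap", ">"}   # 'ap' abbreviates 'apply to'


def parse(tokens):
    """Build a tree from preorder tokens: 'con e1 e2' -> (con, e1, e2)."""
    stream = iter(tokens)

    def expression():
        token = next(stream)
        if token in CONSTRUCTORS:
            left = expression()
            right = expression()
            return (token, left, right)
        return token          # an atom

    return expression()


def reduce_head(tree):
    """Rewrite (ap, hd, (>, A, B)) to A; leave any other tree unchanged."""
    if (isinstance(tree, tuple) and tree[0] == "ap" and tree[1] == "hd"
            and isinstance(tree[2], tuple) and tree[2][0] == ">"):
        return tree[2][1]
    return tree
```

For example, parse("ap hd > A B".split()) yields
('ap', 'hd', ('>', 'A', 'B')), and reduce_head applied to that tree
yields 'A'.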


The basic machine functions that are necessary to
execute Reduction Language expressions may be
readily derived from what has been said about the
language primitives in the previous section.
Roughly speaking, there must be means to

- represent a Reduction Language expression in
a suitable storage medium so that its tree
structure is uniquely exhibited;

- perform a preorder traversal of the
expression stored within this medium;

- recognize, within the immediate environment
of the actual traversal position, the
occurrence of a reducible subexpression;

- execute the reduction according to the
meaning of the respective primitive
expressions (which primarily involves
traversal functions such as the comparison,
deletion, insertion, and copying of
subexpressions);

- resume, after the completion of a reduction,
the traversal up to the topmost root node of
the expression tree.

The first two problems were solved by representing
the Reduction Language expressions in the preorder
notation 'con e1 e2', and storing them in a
push-down stack, with the root node symbol on the
top: so, the expression-tree

        apply to
        /      \
     head       >
               / \
              A   B        Fig. 4

is represented as 'ap(ply) h(ea)d (to) > A B' in
preorder and stored in a stack as

| ap | hd | > | A | B        Fig. 5


Since the preorder traversal scheme requires
that the root node is inspected first, followed by
the traversal of the left subtree in preorder,
followed by the traversal of the right subtree in
preorder, it simply takes a succession of
pop-operations to have the expression emerge from
the stack in the desired sequence, with the item on
top of the stack being the actual traversal
position.

A SINK-stack must be provided into which all
symbols popped out of the first SOURCE-stack must
be pushed in order to conserve the expression
during the traversal. The expression ending up in
the SINK-stack is supposed to appear with the root
node symbols on top of its respective
subexpressions. To accomplish this, a third stack
is required as an intermediate storage for
constructors since they emerge from the
SOURCE-stack ahead of their subexpressions but must
enter the SINK-stack after them.

The corresponding traversal algorithm brings
about the following phases with regard to the
contents of the stacks E as SOURCE-stack, A as
SINK-stack, and M as intermediate stack. Initially,
the expression resides in the E-stack; the stacks A
and M are empty, and the topmost item on E is
inspected:

[Fig. 6: the expression in the E-stack; stacks A and M empty]


As the item is a constructor, it is transferred
into the M-stack and marked with the superscript
'l' which indicates that the left subexpression of
this constructor is now going to be moved from the
E-stack to the A-stack:

[Fig. 7: contents of the A-, E- and M-stacks after this step]

The focus of control returns to the top of the
E-stack and moves the atom 'hd' into the A-stack:

[Fig. 8: contents of the A-, E- and M-stacks after this step]

Then the focus of control turns to the M-stack.
The 'ap' on top of the M-stack is found to be
marked with an 'l'; as its left subexpression has
just been moved over to the A-stack, the marking is
changed to 'r', indicating that now its right
subtree is on top of the E-stack:

[Fig. 9: contents of the A-, E- and M-stacks after this step]

The top-element of the E-stack is a constructor '>'
which is put into the M-stack and marked
with an 'l':

[Fig. 10: contents of the A-, E- and M-stacks after this step]

Then the left subtree 'A' of '>' is moved into the
A-stack and the constructor '>' is marked
with an 'r':

[Fig. 11: contents of the A-, E- and M-stacks after this step]

After the atom 'B' has been moved into the A-stack,
the constructor '>' is found to be marked with an
'r' and can be pushed into the A-stack to complete
the traversal of the subtree '> A B':

[Fig. 12: the A-stack holds hd, A, B, >]

The constructor 'ap' which appears now on top of
the M-stack is found to be marked with an 'r'. As
its right subexpression has just been moved into
the A-stack, the constructor 'ap' must be popped
out of M and pushed into stack A:

[Fig. 13: the A-stack holds the complete expression; E and M are empty]

This completes the traversal since the stacks E and
M are empty and the expression is lined up in the
SINK-stack A in a transposed preorder form, with
the left and right subexpression interchanged.

The execution of the same traversal algorithm with
A as SOURCE- and E as SINK-stack reestablishes the
original situation shown in Fig. 6.

There are two important things that need to be
noticed:

- The manipulation of the stack contents
splits into two phases. First the item
which constitutes the focus of control, the
top of either the E-stack or the M-stack,
is inspected. Then this item becomes the
subject of a stack operation, which is
either a transfer to another stack or a
write-operation on the same stack.

- The constructor on top of the M-stack
controls the movement of its
subexpressions; moreover, there is a
situation where 'ap' is on top of the
M-stack, a function symbol is on top of the
A-stack, and the argument expression is on
top of stack E:

[Fig. 14: 'ap' on top of the M-stack, 'hd' on top of the A-stack, '> A B' on the E-stack]

This property of the traversal scheme serves to
recognize reducible expressions.

In our example, the traversal scheme brings
about a situation where 'ap' appears on top of
stack M and the function 'hd' on top of stack A.
This situation may be readily detected by
simultaneously watching the tops of the
E-, A- and M-stack during the execution of the
preorder traversal.
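
The three-stack traversal and the detection of a reducible situation can
be sketched in a few lines of Python. This is a software illustration
under stated assumptions, not the hardware algorithm: stacks are Python
lists with the top at the end, the marks 'l' and 'r' are stored next to
each constructor on M, and the names traverse and head_instance are
hypothetical.

```python
CONSTRUCTORS = {"ap", ">"}


def head_instance(source, sink, m):
    """The situation of Fig. 14: 'ap' (marked r) on M, 'hd' on A, '>' on E."""
    return (bool(source and sink and m) and m[-1] == ["ap", "r"]
            and sink[-1] == "hd" and source[-1] == ">")


def traverse(source, sink, watch=None):
    """Move one expression from source to sink via an intermediate stack M.

    Returns ("done", m) after a complete traversal, or ("suspended", m)
    as soon as watch(source, sink, m) reports a reducible situation.
    """
    m = []
    while True:
        item = source.pop()
        if item in CONSTRUCTORS:
            m.append([item, "l"])        # left subexpression comes next
        else:
            sink.append(item)            # an atom is a complete subexpression
            while m and m[-1][1] == "r":
                sink.append(m.pop()[0])  # both subexpressions have moved
            if not m:
                return "done", m
            m[-1][1] = "r"               # right subexpression comes next
        if watch is not None and watch(source, sink, m):
            return "suspended", m
```

With E holding 'ap hd > A B' (root on top), traverse(E, A) leaves A with
'ap' on top followed by the transposed expression, and a second call
traverse(A, E) restores E. Passing watch=head_instance stops the first
pass in the situation of Fig. 14.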

If an instance of a reduction rule occurs, the
traversal is immediately suspended and control
switches to another algorithm which performs the
appropriate reduction steps.

The reduction algorithm calls other algorithms
which participate in the evaluation of the
particular subexpression.

This transfer of control is accomplished by
conventional methods of subroutine stacking: code
words representing the algorithms are, in their
order of activation, pushed into a system control
stack S, and popped out upon termination so that
control eventually returns to the original
traversal algorithm.


The reduction of an expression involves rather
simple primitive operations like the deletion of an
expression, which may be viewed as a traversal
without a SINK-stack, copying, which is a traversal
with one SOURCE-stack and two SINK-stacks, and
comparison, which is a traversal with two
SOURCE-stacks and one SINK-stack.

For instance, the reduction of the expression in
Fig. 14 can be done as follows: first, the primitive
function 'hd' in the A-stack, the constructor 'ap'
in the M-stack and the tree-constructor '>' on top
of the E-stack are deleted; then the atom 'A' is
moved to the A-stack, the atom 'B' is deleted and
'A' is moved back into the E-stack; so,
'apply head to > A B' is reduced to 'A'.

Of course, this procedure also works properly if
'A' and 'B' are not only atoms but trees.
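
The reduction steps just listed can be replayed on list-based stacks.
The sketch below is an assumption-laden illustration rather than the
machine's control program: stacks are Python lists with the top at the
end, deletion is modelled as a traversal whose sink is None, and the
names traverse and reduce_head are hypothetical.

```python
CONSTRUCTORS = {"ap", ">"}


def traverse(source, sink, m):
    """Move one complete subexpression from source to sink.

    A sink of None models deletion (a traversal without a SINK-stack).
    Entries already on m below the starting depth are left untouched.
    """
    base = len(m)
    while True:
        item = source.pop()
        if item in CONSTRUCTORS:
            m.append([item, "l"])
            continue
        if sink is not None:
            sink.append(item)
        while len(m) > base and m[-1][1] == "r":
            constructor = m.pop()[0]
            if sink is not None:
                sink.append(constructor)
        if len(m) == base:
            return
        m[-1][1] = "r"


def reduce_head(e, a, m):
    """Reduce 'apply head to > A B' starting from the situation of Fig. 14."""
    if not (m and a and e and m[-1] == ["ap", "r"]
            and a[-1] == "hd" and e[-1] == ">"):
        raise ValueError("not an instance of the head reduction")
    a.pop()                 # delete 'hd' from the A-stack
    m.pop()                 # delete 'ap' from the M-stack
    e.pop()                 # delete '>' from the E-stack
    traverse(e, a, m)       # move the left subexpression A to the A-stack
    traverse(e, None, m)    # delete the right subexpression B
    traverse(a, e, m)       # move A back into the E-stack
```

Because each step is a traversal of a whole subexpression, the same code
handles the case where 'A' and 'B' are trees.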


Other important algorithms include those for
performing arithmetic operations on decimal numbers
of any finite length. In this case, two atomic
subexpressions representing the operands must,
symbol by symbol, be popped out of their respective
SOURCE-stacks and moved through an arithmetic unit
whose output is pushed into the SINK-stack.


To  provide  sufficient  space  for  expression 
manipulation  it  is  convenient  to  have  more  than  the 
stacks  E,  A,  M  and  S  available:  so,  the  machine  has 
another  three  stacks  named  B,  U  and  V  to  store 
expressions. 


An  expression  is  manipulated  only  by  push,  pop, 
read  or  write  operations  affecting  the  items 
residing  on  top  of  the  stacks. 

There is no addressing of objects within the
expression involved: they may become the focus of
attention only through an orderly traversal of the
expression tree which brings them to the top of one
of the stacks. Addresses are used only to identify
the stacks that are to be operated upon in a
particular instance.


Control over the stacks is exercised by means of
the Reduction Unit which may be considered as the
processing unit of the machine. The overall
function of the Reduction Unit is very simple.
Under the control of the algorithm residing on top
of the system control stack S, it inspects the
topmost symbols of one or two selected stacks.
Thereupon, it goes through a decision process
(realized by combinatorial logic networks) as a
result of which it may issue new symbols and
specify stacks which are to be pushed, popped,
written into and read next. A small sequential
network, comprising some status flipflops,
navigates the machine through the sequence of
actions required by the reduction process.

More specifically, the Reduction Unit provides
all the facilities to perform the various traversal
algorithms, to recognize instances of reductions,
and to execute the reductions, including an
arithmetic unit for arithmetic operations on
decimal numbers.

There is also an I/O-Processor which loads
expressions into the machine and unloads them after
reduction, and via which the user may exercise
control over the machine.

An elementary cycle of operation within the
Reduction Machine quite naturally partitions into
four phases as illustrated below:

    phase (1)                     phase (2)
    Reduction Unit                Transfer of
    analyses stack     ---->      symbols from the
    symbols                       Reduction Unit
                                  to the stacks
        ^                             |
        |                             v
    phase (4)                     phase (3)
    Transfer of                   Operations
    symbols from the   <----      on the
    stacks to the                 stacks
    Reduction Unit

Starting  in  phase  (1),  the  Reduction  Unit  is  about 
to  analyse  what  it  has  just  read  from  the  selected 
stacks.  Then  the  machine  enters  phase  (2)  during 
which  push,  pop,  read  and  write  control  signals, 
together  with  new  symbols,  are  transferred  from  the 
Reduction  Unit  to  the  stacks. 

During phase (3), up to four stacks can be
pushed and popped such that new symbols appear in
their topmost positions at the end of this phase.
During phase (4), the topmost items of the stacks
which have been selected for a read operation are
moved into the Reduction Unit which again enters
phase (1).
The hardware model of the Reduction Machine was
primarily intended to demonstrate the feasibility
of the Reduction Language principles. Its design
was largely determined by the objective of getting
a simple and reliable machine into operation as
quickly as possible.

The machine employs standard low-power Schottky
TTL technology for all logic circuits, registers,
status-flipflops etc., fast read-only-memory
devices for the realization of a control store in
which the reduction algorithms are implemented, and
dynamic random access memory chips for the
realization of the stacks.


All machine operations are under the control of
a central clock which subdivides a machine cycle
into eight intervals of equal length. As the clock
runs at a frequency of 6.25 MHz, an interval lasts
160 nsec and a machine cycle lasts 1.28
microseconds. The effective speed of operation,
however, is slightly slower since every 16th
machine cycle is used for a refresh operation on
all stacks.
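
The timing figures can be checked with a few lines of arithmetic. The
sketch below uses exact fractions; the effective cycle time assumes that
"every 16th machine cycle" means 15 useful cycles out of every 16, which
is an interpretation of the sentence above and not a measured value. The
variable names are chosen for this example only.

```python
from fractions import Fraction

CLOCK_HZ = Fraction(6_250_000)      # 6.25 MHz central clock
INTERVALS_PER_CYCLE = 8             # intervals per machine cycle
REFRESH_EVERY = 16                  # one refresh cycle in every 16

# Length of one clock interval and of one machine cycle, in nanoseconds.
interval_ns = Fraction(10**9) / CLOCK_HZ
cycle_ns = INTERVALS_PER_CYCLE * interval_ns

# Average time per useful machine cycle when 1 cycle in 16 is a refresh
# (assumed interpretation).
effective_cycle_ns = cycle_ns * REFRESH_EVERY / (REFRESH_EVERY - 1)

print(interval_ns, cycle_ns, float(effective_cycle_ns))
```

Under that assumption the refresh overhead stretches the average useful
cycle from 1.28 to roughly 1.37 microseconds.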


A block diagram of the hardware architecture is
shown in Fig. 16. It comprises the Reduction Unit
(which is subdivided into four subunits named
TRANS, REDREC, REDEX, ARITH), a set of seven
pushdown stacks, a bus system which handles the
traffic of symbols and control-signals between the
Reduction Unit and the stacks, a central timing
system CTS, and an I/O-Processor (a conventional
INTEL SBC 80/20 single board computer) which also
performs some monitoring and preprocessing
functions.

The data paths within the entire machine are
laid out to accommodate byte formats (eight bits
plus parity), i.e. all stacks, data busses and
Reduction Unit circuits are one byte wide.


The Reduction Unit comprises four modules, each
of which is accommodated by a separate printed
circuit board:

- TRANSport performs all traversal algorithms
(including deletion, comparison, copying);

- REDuction RECognition is a combinatorial
logic network that looks, during the
traversal of an expression, for the
appearance of an instance of a reduction.
Upon the detection of a reducible expression,
REDREC immediately deactivates the
TRANS-subunit, pushes a new algorithm-code on
top of the S-stack and turns control over to

- REDuction EXecution, which essentially
comprises a fast control memory containing
all the control programs which are required
to perform the reductions. As for arithmetic
operations, REDEX is supported by the

- ARIThmetic unit which performs the arithmetic
operations on the decimal numbers which,
under the control of REDEX, are received
digit by digit from the respective SOURCE
stacks; the resulting digits are sent back to
the SINK stack.
Fig. 16: Block Diagram of the Reduction Machine Architecture

A stack is schematically shown in Fig. 17. The
major components are the random access memory, a
stack pointer to the actual top-of-stack location,
a separate TOP OF STACK register in which the
actual topmost item resides, and a COPY-register in
which a copy of the contents of the TOP-register is
held.


The TOP-register may receive a data item, via
the multiplex circuit INBUSSELECT, from one of
three sources: the KBUS, the LBUS, or the memory
cell that is addressed by the stack pointer. The
contents of the TOP-register may be supplied to the
KBUS or LBUS via the multiplex circuit
OUTBUSSELECT.

The stack operations are as follows: the
TOP-register contains the topmost data item k of
the stack, a copy of which is in the COPY-register;
the stackpointer addresses the first empty cell of
the random access memory stack area.
Upon a push operation, an item enters via
INBUSSELECT from KBUS or LBUS and is written into
the TOP-register. Subsequently, the contents of the
COPY-register, i.e. the old top of the stack, are
stored away in the empty cell addressed by the
stackpointer. Afterwards, the stackpointer is
incremented by one to point again to the first
empty cell, and the contents of the TOP-register
are copied into the COPY-register.

Conversely, if the stack is to be popped up, the
stackpointer is first decremented by one to point
to the last occupied cell, then the contents of
this memory cell are read out and written into the
TOP-register, whose new value is copied into the
COPY-register.

Read  and  write  operations  affect  only  the 
contents  of  the  TOP-  and  COPY-registers  and  cause 
no  memory  access  cycle. 
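
The register-level behaviour of one stack can be modelled as follows.
This Python class is a sketch under stated assumptions: the class name,
the memory size, the use of None for the register contents of an empty
stack, and the memory_cycles counter are all hypothetical and only serve
to make the sequence of steps explicit.

```python
class HardwareStack:
    """Software model of one stack: RAM, stack pointer, TOP and COPY."""

    def __init__(self, size=16):
        self.memory = [None] * size
        self.sp = 0              # addresses the first empty memory cell
        self.top = None          # TOP-register (None: assumed empty marker)
        self.copy = None         # COPY-register
        self.memory_cycles = 0   # counts memory accesses only

    def push(self, item):
        self.top = item                       # new item enters TOP
        self.memory[self.sp] = self.copy      # old top is stored away
        self.memory_cycles += 1
        self.sp += 1                          # point to first empty cell
        self.copy = self.top

    def pop(self):
        if self.sp == 0:
            raise IndexError("pop from empty stack")
        removed = self.top
        self.sp -= 1                          # point to last occupied cell
        self.top = self.memory[self.sp]       # read it into TOP
        self.memory_cycles += 1
        self.copy = self.top
        return removed

    def read(self):
        return self.top                       # no memory access

    def write(self, item):
        self.top = item                       # registers only,
        self.copy = item                      # no memory access
```

Keeping the topmost item in a register is what lets read and write
complete without a memory cycle, as the counter in the model shows.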

Input/output processing and certain system
support functions are handled by a conventional
INTEL SBC 80/20 single board microcomputer which,
via a tailor-made I/O-interface, is attached to the
bus system of the Reduction Machine. The currently
implemented I/O-configuration only supports a data
station Hewlett Packard HP 2645A which perfectly
suits the purpose of the Laboratory Model:
Reduction Language expressions can be edited,
shipped into the Reduction Machine for the
execution of a user-specified number of reductions,
and displayed afterwards. As the HP 2645A data
station includes two tape cartridge drives, user
expressions and standard library functions may be
stored away to and retrieved from tape.


Perspective 

When assessing its strengths and weaknesses, the
Reduction Machine architecture and its hardware
realization as described in this report should be
seen in the light of the following aspects:

- the Reduction Machine is the first of its
kind that directly supports the execution of
reduction languages; its architecture has
been straightforwardly derived from the basic
structure of reduction language expressions
and their rules of execution;

- the concept of not using addresses for the
representation of expressions within the
Reduction Machine has nowhere been
compromised;

- the Reduction Machine was primarily conceived
as an interactive tool for systematic
construction of functional programs, serving
only one user at a time;

- the hardware model was simply intended as a
vehicle to demonstrate that the Reduction
Language concept can be adequately supported
by the proposed machine architecture; neither
memory capacity nor performance in terms of
program runtime were a design objective.

Fig. 17: Block Diagram of the Reduction Machine Stack
Organisation
There remain a number of problems that need to
be solved before the Reduction Machine can be
accepted as a competitive alternative to von
Neumann computers. At the level of machine
architecture, these problems concern

- adequate interfacing with peripheral memory
devices like disks and tapes to support
program libraries, serious data base
applications, and also the concept of
'virtual stacks', i.e. transparent stack
extension into secondary storage;

- program-controlled input and output of
expressions from and to peripheral devices;

- interrupt facilities supporting the
communication with I/O-Processors, real time
applications and the cooperation with other
Reduction Machines;

- measures that remedy a serious performance
degradation in list processing applications
which is caused by excessive copying
activities.


Preliminary  studies  have  shown  that  program 
controlled  I/O  and  interrupt  handling  can  neatly  be 
integrated  into  the  language  concept  by  introducing 
appropriate  constructs. 

As it appears now, the interfacing with
conventional peripherals necessitates traditional
file management methods and data transmission
techniques since device controllers are designed
for standard interfaces with conventional
computers. Hence, the microprocessor approach for
I/O-handling which has been taken with the
Laboratory Model seems to be a step in the right
direction, guided by the type of peripheral devices
that are currently available in the market-place.
However, with future advances in electronic disk
technologies, stack-type peripheral memory devices
of sufficiently large capacity that are compatible
with the internal structure of the Reduction
Machine may be anticipated.

To significantly expedite the processing of
large list structures, the hard-lined
'no-addresses' approach may have to be softened to
some extent. Conceivably, subexpressions could be
linked to their respective constructors by relative
pointers within the internal representation of an
expression. Along these pointers, the focus of
control could be moved directly to a particular
subexpression rather than traversing linearly
through the expression tree that is to the left and
above it.

It may also be envisaged that such a pointer
structure facilitates the partitioning of an
expression into subexpressions of suitable meaning
that can be distributed for concurrent processing
within a system of cooperating Reduction Machines.
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AN  EXPRESSION  ORIENTED  EDITOR  FOR  LANGUAGES  WITH  A  CONSTRUCTOR  SYNTAX 
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Postfach  1Z40,  Schloft  Birtinyhoven 


Abstract 

A wide class of languages can be defined by using a
constructor syntax. This paper gives a short
introduction to the constructor syntax and describes
an interactive editing system for languages having
such a syntax. In contrast to conventional
line-oriented editors this editing system is completely
expression-oriented. The system has been
successfully implemented for Berkling's Reduction Machine.


Backus introduces in his report [BACKUS 73]
languages with a constructor syntax: The pair (A,K)
is a constructor syntax for a language E if the
following conditions hold:

1. A ⊂ E

2. Each k ∈ K is a function from a subset S_k of
   E^n into E

3. For every e ∈ E, either e ∈ A or there are a
   unique k ∈ K and unique e_1,...,e_n ∈ E such
   that k[e_1,...,e_n] = e.

Each element a of A is called an atom, and each
k ∈ K is called a constructor. Let k[e_1,...,e_n] =
e, then e_1,...,e_n are called subexpressions of the
expression e and k is called an n-place-
constructor. Each expression of a language with a
constructor syntax is either an atom or can be
written as e = k[e_1,...,e_n].

Example 1: Definition of a language

Let A = {a_1,...,a_5} and K = {k_1,k_2} with
k_i ∈ [E×E --> E], i.e. the k_i's are two-place-
constructors. Then (A,K) is a constructor syntax
defining a language which we call E and to which
we will refer in the following chapters. An
example of an expression of the language E is

   k_1[a_1,k_1[a_2,k_2[a_3,k_2[a_4,a_5]]]]

Expressions of languages with a constructor syntax
can be represented as trees. Atoms become the
leaves of the trees, whereas the constructors form
the nodes.

        k_1
       /   \
    a_1     k_1
           /   \
        a_2     k_2
               /   \
            a_3     k_2
                   /   \
                a_4     a_5

Fig. 1: Tree-representation of the expression of
Example 1.

The language E which has been defined in Example 1
looks very abstract, for we did not associate any
meaning with the atoms or constructors, we just
gave them formal names.

Now we are going to discuss the following two
representations of the language:

1. Its representation within a machine (machine
   interface or internal representation)

2. Its representation on a display station (user
   interface or external representation)

A representation of an expression within a given
machine is obtained by:

1. coding the atoms and constructors

2. mapping the structure of the expressions into
   storage

Each atom or constructor is stored within a memory
cell; the coding function maps the symbolic name of
an atom or constructor into a value which fits into
a memory item, e.g. a_1 is mapped to the hexadecimal
constant X'35'.

In the following we will denote the coding of an
atom or a constructor x by $x, i.e. the symbolic
name for the coding of the atom a_1 is $a_1.




We have already mentioned that each expression
can be represented by a tree; this means that we
have to map a tree-structure into memory. A
convenient way to do that is to connect the elements
by pointers. Figure 2 shows such a realization for
the expression defined by Example 1:

[Figure 2: memory cells holding the codings $k_1, $a_1, ... connected by pointers; layout lost]

Fig. 2: Representation of an expression by using
pointers.

Berkling has used another method within his
Reduction Machine; the expressions are stored within
stacks, using the preorder notation of the
associated expression-tree. Figure 3 shows how the
expression of Example 1 is stored.

Top of stack
  |
  v
| $k_1 | $a_1 | $k_1 | $a_2 | $k_2 | $a_3 | $k_2 | $a_4 | $a_5 |

Fig. 3: Representation of an expression in a stack
using preorder notation.

In this paper we will prefer the stack
representation since it has the following advantages:

1. The representation is very close to the
   formal definition of expressions, i.e.
   removing brackets and commas from the formal
   definition leads directly to the preorder
   notation (cf. Example 1 and Figure 3).

2. It is free of pointers which are not directly
   related to the problem.

Thus the algorithms of the editing system which we
are going to describe will be more clear and
precise, for they are free from pointer manipulation
and garbage collection problems.
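
The first advantage can be illustrated with a small sketch. It is not taken from the paper: the encoding of expressions as nested Python tuples, the names to_preorder and from_preorder, and the ARITY table are assumptions made for illustration only. The sketch shows that the preorder item list is obtained by dropping brackets and commas, and that the arity of each constructor is enough to recover the tree without any pointers.

```python
# Hypothetical encoding (not from the paper): an atom is a string, a
# constructor application k[e_1,...,e_n] is the tuple (k, e_1, ..., e_n).
ARITY = {"k1": 2, "k2": 2}   # both constructors of Example 1 are two-place


def to_preorder(expr):
    """Flatten an expression tree into its preorder item list (top of stack first)."""
    if isinstance(expr, str):          # an atom is a single item
        return [expr]
    name, *subs = expr                 # constructor followed by its subexpressions
    items = [name]
    for sub in subs:
        items.extend(to_preorder(sub))
    return items


def from_preorder(items):
    """Rebuild the tree; the arity of each constructor tells where it ends."""
    def parse(pos):
        name = items[pos]
        pos += 1
        n = ARITY.get(name, 0)         # unknown names are treated as atoms
        if n == 0:
            return name, pos
        subs = []
        for _ in range(n):
            sub, pos = parse(pos)
            subs.append(sub)
        return (name, *subs), pos

    tree, end = parse(0)
    assert end == len(items), "trailing items after a complete expression"
    return tree


EXAMPLE_1 = ("k1", "a1", ("k1", "a2", ("k2", "a3", ("k2", "a4", "a5"))))
STACK = to_preorder(EXAMPLE_1)
print(" ".join(STACK))
```

The printed item sequence corresponds to the stack contents of Figure 3, read from the top of the stack.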


3. External Representation

Normally the user is not interested in the internal
coding of an expression. He wants to see certain
keywords or strings which have a meaning to him.

Therefore we need another function - the
I/O-function - mapping formal expressions into
expressions which can be understood by the user. The
I/O-function can be defined by a table which
associates all atoms with a string and all constructors
with a prototype-expression that consists of some
keywords and place-holders (□) which indicate where
the subexpressions are going to be inserted. In
Figure 4 a possible I/O-table for the language
defined in Example 1 is shown:


a_1 = head | a_2 = tail

a_3 = A | a_4 = B | a_5 = C

k_1 = apply □
      to □

k_2 = > □ □

Fig. 4: I/O-table for the language given by
Example 1 (translation to Berkling's Reduction
Language).

Using the I/O-table above the expression of
Example 1 is displayed as:

apply head
to apply tail
   to > A > B C

which is a valid expression for Berkling's
Reduction Language.

Different I/O-tables may exist for the same
formal language. The next figure shows a
translation of formal expressions to LISP:

a_1 = CAR | a_2 = CDR

a_3 = A | a_4 = B | a_5 = C

k_1 = ( □ □ )

k_2 = ( □ . □ )

Fig. 5: I/O-table for the language given by Example
1 (translation to LISP).

Using this table results in: (CAR(CDR(A.(B.C))))
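
The role of the I/O-table can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation described in the paper: expressions are encoded as nested Python tuples, the function name render is hypothetical, prototypes are written on one line, and the LISP prototypes contain blanks that the printed result above omits.

```python
PLACEHOLDER = "□"

# The two tables of Figures 4 and 5, with each prototype written on one line.
REDUCTION_TABLE = {
    "a1": "head", "a2": "tail", "a3": "A", "a4": "B", "a5": "C",
    "k1": "apply □ to □",
    "k2": "> □ □",
}
LISP_TABLE = {
    "a1": "CAR", "a2": "CDR", "a3": "A", "a4": "B", "a5": "C",
    "k1": "(□ □)",
    "k2": "(□ . □)",
}


def render(expr, table):
    """Map a formal expression to its external representation."""
    if isinstance(expr, str):                   # atom: look up its string
        return table[expr]
    name, *subs = expr
    parts = table[name].split(PLACEHOLDER)      # keywords around the placeholders
    if len(parts) != len(subs) + 1:
        raise ValueError(f"prototype of {name} does not match its arity")
    text = parts[0]
    for sub, keyword in zip(subs, parts[1:]):   # fill placeholders left to right
        text += render(sub, table) + keyword
    return text


EXAMPLE_1 = ("k1", "a1", ("k1", "a2", ("k2", "a3", ("k2", "a4", "a5"))))
print(render(EXAMPLE_1, REDUCTION_TABLE))
print(render(EXAMPLE_1, LISP_TABLE))
```

The same formal expression yields two external forms; only the table differs, which is the property the editing system relies on.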

The editing system which we are going to develop
will only be based on the formal definition of
expressions. The external representation of an
expression is generated by using an I/O-table, which
may be a default table supplied by the system, or a
table defined by a user who wants to use his own
external representation of a language.

The next figure shows the relationship between
the different representations of an expression:

[Figure 6: diagram not recoverable; surviving labels: formal expression, coding, system, internal representation]

Fig. 6: Representations of formal expressions.


II. The Interactive Editing System


1. An Expression Oriented Editing System

Conventional editors are line-oriented, i.e. a line
is the smallest logical unit. Almost all commands
of such an editor refer to lines, e.g. move lines,
copy lines, insert lines, scroll up and down a
certain number of lines. There is no relationship to
the structure of the program which is edited, or to
the language it is written in. If a user wants to
delete a begin-end-block, he has to find the
corresponding lines for deletion. This can be very
tedious if nested blocks are used or the block does
not fit onto the display.

We have implemented an editor which is not
line-oriented but expression-oriented, i.e. the smallest
logical unit the user can handle is a complete
expression. All commands of the editor will refer to
expressions, e.g. copy expressions, move
expressions, delete expressions, scroll to a
subexpression etc.

Since the smallest logical units in our editing
system are complete expressions, we first describe
some basic algorithms which allow us to move, copy
and to delete expressions.

The editor works with five stacks which are
called E, A, M, B, and U. Expressions are stored
within stacks using the representation described in
1.2 (cf. Figure 3).

[Figure 7: block diagram of the EDITOR-logic and its stacks; layout lost]

Fig. 7: Memory used by the Editor.

There are the following primitive procedures and
functions to handle stack elements:

POP(X):        deletes the item on top of stack X

PUSH(I,X):     pushes item I into stack X

MOVE(X,Y):     moves one item from stack X to
               stack Y

MOVE2(X,Y,Z):  moves one item from stack X to the
               stacks Y and Z.

The functions and procedures listed above will not
be explained any further in this paper.

The algorithm TRANSPORT moves a complete expression
from one stack to another stack. A third stack, the
control-stack M, is used for intermediate saving of
constructors. A call of the algorithm TRANSPORT is
denoted by TRANSPORT(X,Y) where X and Y are
stacknames and the expression is moved from the
stack X to the stack Y, i.e. TRANSPORT(E,A) moves
an expression from stack E to stack A.

In the following we will use a PASCAL-like
language to specify algorithms. We assume that
there is a global type-declaration of the stacks:

   TYPE STACKNAME = (E,A,M,B,U);

Then the algorithm TRANSPORT is given by

   PROCEDURE TRANSPORT(X,Y: STACKNAME);
   BEGIN
     CASE TOP(X) OF
       ATOM:          MOVE(X,Y);
       N-CONSTRUCTOR: BEGIN
                        MOVE(X,M);
                        FOR I:=1 TO N
                          DO TRANSPORT(X,Y);
                        MOVE(M,Y);
                      END
     END
   END

TRANSPORT is a recursive algorithm: after a
constructor has been saved in the M-stack all its
subexpressions are moved to the sink-stack, then the
constructor is moved from the M-stack to the
sink-stack.

Note: During the transport the subexpressions of
constructors are interchanged, e.g. applying the
algorithm TRANSPORT to the expression shown in
Figure 3 yields:

Top of stack
  |
  v
| $k_1 | $k_1 | $k_2 | $k_2 | $a_5 | $a_4 | $a_3 | $a_2 | $a_1 |

Fig. 8: Result of transporting the expression given
by Figure 3.

Applying the algorithm TRANSPORT repeatedly to an
expression yields the following transformation:

                 TRA.                  TRA.
K(e_1,...,e_n)  -->  K(e_n,...,e_1)  -->  K(e_1,...,e_n)

i.e. an even number of transports always yields the
original expression.

The algorithm TRANSPORT2 moves an expression from
one stack to two other stacks. It is called by
TRANSPORT2(X,Y,Z), where X, Y, and Z are
stacknames: Z denotes the second stack to which the
expression is moved. The algorithm differs in only
one point from the algorithm TRANSPORT: atoms and
constructors are pushed into two sink-stacks.

   PROCEDURE TRANSPORT2(X,Y,Z: STACKNAME);
   BEGIN
     CASE TOP(X) OF
       ATOM:          MOVE2(X,Y,Z);
       N-CONSTRUCTOR: BEGIN
                        MOVE(X,M);
                        FOR I:=1 TO N
                          DO TRANSPORT2(X,Y,Z);
                        MOVE2(M,Y,Z);
                      END
     END
   END


2.3. The Algorithm COPY

This algorithm copies an expression from one stack
to another stack without interchanging the
subexpressions. COPY has the same parameters as the
algorithm TRANSPORT, i.e. COPY(A,E) copies an
expression from stack A to stack E. The
algorithm COPY uses the algorithms TRANSPORT2 and
TRANSPORT:

   PROCEDURE COPY(X,Y: STACKNAME);
   VAR Z1,Z2: STACKNAME;
   BEGIN
     Z1:= ...; Z2:= ...;
     TRANSPORT2(X,Z1,Z2);
     TRANSPORT(Z1,X);
     TRANSPORT(Z2,Y);
   END

The stacks Z1 and Z2 are used as scratch pad stacks
for expressions. They must be different from the
stacks X and Y.

2.4. The Algorithm DELETE

The algorithm DELETE removes a complete expression
from a stack. It has only one parameter which is
the name of the stack where the expression is to be
deleted, i.e. DELETE(E) deletes an expression in
stack E:

   PROCEDURE DELETE(X: STACKNAME);
   BEGIN
     CASE TOP(X) OF
       ATOM:          POP(X);
       N-CONSTRUCTOR: BEGIN
                        POP(X);
                        FOR I:=1 TO N DO DELETE(X);
                      END
     END
   END

The algorithms described above have been
implemented by hardware in Berkling's Reduction Machine.
A description is given in [KLUGE 79].
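
The four stack algorithms of this chapter can be tried out with the following sketch. It is a software illustration under assumptions, not the hardware realization mentioned above: stacks are Python lists whose last element is the top of stack, the ARITY table and the helper names new_stacks, load and top_first are hypothetical, and Z1 and Z2 are simply two extra named stacks.

```python
ARITY = {"k1": 2, "k2": 2}        # hypothetical arity table; other items are atoms


def new_stacks():
    # E, A, M, B, U as in the paper, plus two scratch stacks for COPY.
    return {name: [] for name in ("E", "A", "M", "B", "U", "Z1", "Z2")}


def load(stacks, x, items_top_first):
    stacks[x].extend(reversed(items_top_first))    # end of the list = top of stack


def top_first(stacks, x):
    return list(reversed(stacks[x]))


def transport(stacks, x, y):
    n = ARITY.get(stacks[x][-1], 0)
    if n == 0:
        stacks[y].append(stacks[x].pop())          # MOVE(X,Y)
    else:
        stacks["M"].append(stacks[x].pop())        # MOVE(X,M)
        for _ in range(n):
            transport(stacks, x, y)
        stacks[y].append(stacks["M"].pop())        # MOVE(M,Y)


def transport2(stacks, x, y, z):
    n = ARITY.get(stacks[x][-1], 0)
    if n == 0:
        item = stacks[x].pop()                     # first half of MOVE2(X,Y,Z)
    else:
        stacks["M"].append(stacks[x].pop())        # MOVE(X,M)
        for _ in range(n):
            transport2(stacks, x, y, z)
        item = stacks["M"].pop()                   # first half of MOVE2(M,Y,Z)
    stacks[y].append(item)
    stacks[z].append(item)


def copy(stacks, x, y):
    transport2(stacks, x, "Z1", "Z2")              # two mirrored copies
    transport(stacks, "Z1", x)                     # restore the source
    transport(stacks, "Z2", y)                     # deliver the copy


def delete(stacks, x):
    n = ARITY.get(stacks[x].pop(), 0)              # POP(X)
    for _ in range(n):
        delete(stacks, x)


FIG_3 = ["k1", "a1", "k1", "a2", "k2", "a3", "k2", "a4", "a5"]
stacks = new_stacks()
load(stacks, "E", FIG_3)
transport(stacks, "E", "A")
mirrored = top_first(stacks, "A")      # interchanged order, as in Fig. 8
transport(stacks, "A", "E")            # a second transport restores Fig. 3
copy(stacks, "E", "B")
print(mirrored)
print(top_first(stacks, "E") == top_first(stacks, "B") == FIG_3)
```

COPY works because each of the two scratch copies is transported exactly once more, so every copy has been transported an even number of times.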


We have already mentioned that the editor works
with five stacks which are called E, A, M, B, and
U. Stack E contains the expression which is
displayed to the user. We call this expression Focus
of Attention (FA). Stack M is the control stack
which is used by the TRANSPORT-algorithm. Stack B
is used for input and output, i.e. input operations
move an expression from the display station to [...]

[Figure 9: diagram relating stack E, stack B and the display buffer; layout lost]

Fig. 9: Output of an expression.

The algorithm OUTPUT is a modified TRANSPORT-
algorithm. It is defined by

   PROCEDURE OUTPUT;
   BEGIN
     CASE TOP(B) OF
       ATOM:          DISPLAY;
       N-CONSTRUCTOR: BEGIN
                        DISPLAY;
                        FOR I:=1 TO N DO OUTPUT;
                      END
     END;
     ERROR: BEGIN
              DELETE(B); ABBREVIATE;
            END;
   END

The procedure DISPLAY pops one item out of stack B,
retrieves its representation from the I/O-table
(cf. 1.3), and replaces the associated placeholder
within the display-buffer by the representation.
Before the algorithm OUTPUT is called, the display
buffer is cleared and one placeholder is inserted.
Figure 10 shows the different states of the
algorithm OUTPUT, using the I/O-table given in
Figure 4 and the expression k_1[a_1,k_2[a_4,a_5]]:

[Figure 10: successive contents of stack B and of the display buffer; table lost]

The procedure DISPLAY fails if there is not enough
space within the display buffer to insert the
representation of an atom or a constructor. In this
case the error exit is taken: The corresponding
subexpression in stack B is deleted and the
algorithm ABBREVIATE replaces the current
placeholder by an abbreviation symbol. This means
that the innermost subexpressions are automatically
abbreviated and the complete Focus of Attention is
shown on the display.

5. Scrolling and Displaying Selected Subexpressions

In this chapter we will describe how the user can
change the Focus of Attention in order to look at
subexpressions which have been abbreviated.
Line-editors can display hidden information by means of
scrolling commands: display previous page, display
next page, scroll up n lines etc., i.e. scrolling
is completely line-oriented. For our purpose we
need a scrolling mechanism which is
expression-oriented since the hidden information always
consists of complete subexpressions.

But the problem is how to select a subexpression
on the display and how to find the corresponding
subexpression within the expression in the E-stack.
The solution we are looking for should be
independent of the current I/O-table that is used for
displaying expressions; it should only depend on the
constructor syntax.

An easy way to select a subexpression is to move
the cursor to its position on the display. But
cursor-addresses are not expression dependent, they
are just given by a line and column number. So we
have to translate the cursor-address into an
appropriate expression-address. In our editing
system these 'appropriate' addresses are themselves
expressions taken from a special address language
LADDR. The language LADDR is defined by the
following constructor-syntax:

Let A = {1,2,...} ∪ {nil}, i.e. an atom is
either a natural number or nil, and let K =
{K2ADDR} where K2ADDR is a two-place constructor.
Then the editor will use the following expressions
of LADDR as addresses of expressions or
subexpressions:

1. The root of an expression gets the address
   nil

2. The i'th subexpression gets the address
   K2ADDR[i,ADDR] where ADDR is the address of
   the current expression

In order to make addresses more readable we use the
following representation for the constructor
K2ADDR: K2ADDR[X,Y] = X.Y

The next figure shows the expression of Figure
10, where each subexpression has been marked with
its address.

            k_1
           (nil)
          /     \
       a_1       k_2
    (1.nil)    (2.nil)
               /     \
            a_4       a_5
        (1.2.nil)  (2.2.nil)

Fig. 11: Expression and its addresses.


Note: All addresses terminate with the special atom
nil. Readers who are familiar with LISP probably
have noticed that expression-addresses are
represented by a list of integers. As expression-
addresses are based on a constructor syntax, the
basic algorithms TRANSPORT, COPY, etc. may also be
applied to them. Besides these we will need some
other algorithms to handle addresses:

HEAD(ADDR):     extracts the first number from an
                address, i.e. HEAD(1.2.3.4.nil) = 1

TAIL(ADDR):     removes the first number from an
                address, i.e. TAIL(1.2.3.4.nil) =
                2.3.4.nil

REVERSE(ADDR):  reverses the sequence of the
                numbers that constitute an
                address, i.e. REVERSE(1.2.3.4.nil)
                = 4.3.2.1.nil

These algorithms can be expressed by using the
basic transport algorithm and the operations POP,
PUSH, and MOVE.
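
The three address operations can be sketched as follows. The sketch does not use the stack primitives mentioned above; instead it assumes a hypothetical encoding in which nil is Python's None and K2ADDR[X,Y] is the pair (X, Y). The helper names addr and show are likewise assumptions for illustration.

```python
NIL = None   # assumed encoding of the special atom nil


def addr(*numbers):
    """Build the address n1.n2.....nil from its numbers."""
    result = NIL
    for n in reversed(numbers):
        result = (n, result)          # K2ADDR[n, result]
    return result


def head(a):
    return a[0]                       # first number of the address


def tail(a):
    return a[1]                       # address without its first number


def reverse(a):
    result = NIL
    while a is not NIL:               # prepend each number in turn
        result = (head(a), result)
        a = tail(a)
    return result


def show(a):
    """Readable form X.Y used in the text."""
    return "nil" if a is NIL else f"{head(a)}.{show(tail(a))}"


A = addr(1, 2, 3, 4)
print(show(A), head(A), show(tail(A)), show(reverse(A)))
```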

Given a reversed expression-address, we can
define an algorithm SCROLLDOWN which selects the
corresponding subexpression. Basically, SCROLLDOWN
is a transport-algorithm which moves an expression
from stack E to stack A, but the transport is
stopped as soon as the selected subexpression is on
top of stack E:

   PROCEDURE SCROLLDOWN(ADDR: EXPRESSION-ADDRESS);
   BEGIN
     IF NOT(ADDR = nil)
     THEN BEGIN
       MOVE(E,M);
       WHILE I < HEAD(ADDR)
         DO BEGIN TRANSPORT(E,A); I:=I+1; END;
       SCROLLDOWN(TAIL(ADDR));
     END
   END

When the SCROLLDOWN algorithm stops, the stacks A
and M contain the environment of the selected
subexpression. Stack M contains all the constructors
which have been encountered when walking to the
subexpression, whereas stack A contains all the
subexpressions which have been removed in order to
get the subexpression on top of stack E.

After having selected a subexpression and after
having performed some actions on it the user may
want to return to the expression from where
scrolling was invoked. This is done via the
algorithm SCROLLUP which is the inverse of the
algorithm SCROLLDOWN:

   PROCEDURE SCROLLUP(ADDR: EXPRESSION-ADDRESS);
   BEGIN
     WHILE I < HEAD(ADDR)
       DO BEGIN TRANSPORT(A,E); I:=I+1; END;
     MOVE(M,E);
     SCROLLUP(TAIL(ADDR));
   END

Before the algorithm SCROLLUP is called the
expression-address is not reversed. SCROLLUP moves
the subexpressions and constructors having been
moved to the stacks A and M back to stack E, thus
reconstructing the original expression again.

The editing system also supports nested
scrolling: Whenever the algorithm SCROLLDOWN is
called the associated expression-address is moved
to stack U. A sequence of scroll-downs then
generates a sequence of expression numbers within stack
U. When scroll-up is requested the required
expression-address is found on top of stack U from
where it is removed.
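
A sketch of scrolling, under assumptions that go beyond the pseudocode: addresses are plain Python lists of numbers (innermost number first, nil omitted), the counter I is taken to start at 1 for every constructor, both recursions are assumed to end when the address is exhausted, and the names scroll_down, scroll_up and the ARITY table are hypothetical. The unreversed address is kept on stack U so that scroll-up needs no argument.

```python
ARITY = {"k1": 2, "k2": 2}        # hypothetical arity table; other items are atoms


def transport(stacks, x, y):
    n = ARITY.get(stacks[x][-1], 0)
    if n == 0:
        stacks[y].append(stacks[x].pop())
    else:
        stacks["M"].append(stacks[x].pop())
        for _ in range(n):
            transport(stacks, x, y)
        stacks[y].append(stacks["M"].pop())


def scroll_down(stacks, address):
    """Bring the subexpression with this address to the top of stack E."""
    stacks["U"].append(address)                    # remembered for scroll_up
    for index in reversed(address):                # root first, as in REVERSE(ADDR)
        stacks["M"].append(stacks["E"].pop())      # MOVE(E,M)
        for _ in range(index - 1):                 # skip the preceding siblings
            transport(stacks, "E", "A")


def scroll_up(stacks):
    """Undo the most recent scroll_down."""
    address = stacks["U"].pop()
    for index in address:                          # innermost number first
        for _ in range(index - 1):
            transport(stacks, "A", "E")            # put the siblings back
        stacks["E"].append(stacks["M"].pop())      # MOVE(M,E)


def top_first(stack):
    return list(reversed(stack))


EXPR = ["k1", "a1", "k2", "a4", "a5"]              # k_1[a_1,k_2[a_4,a_5]], top first
stacks = {name: [] for name in ("E", "A", "M", "B", "U")}
stacks["E"] = list(reversed(EXPR))

scroll_down(stacks, [2, 2])                        # address 2.2.nil selects a_5
selected = stacks["E"][-1]
environment = (top_first(stacks["M"]), top_first(stacks["A"]))
scroll_up(stacks)
print(selected, environment, top_first(stacks["E"]))
```

After scroll_down, stack M holds the constructors on the path and stack A the skipped siblings, which is the environment described in the text.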

Now there is one problem left: Which expression-
address belongs to which cursor-address? This
relationship is established via algorithm OUTPUT
which is extended in the following way: For each
atom and for each constructor the corresponding
expression-address is generated:

   PROCEDURE OUTPUT(ADDR: EXPRESSION-ADDRESS);
   BEGIN
     CASE TOP(B) OF
       ATOM:          DISPLAY;
       N-CONSTRUCTOR: BEGIN
                        DISPLAY;
                        FOR I:=1 TO N
                          DO OUTPUT(I.ADDR);
                      END
     END
     ERROR: BEGIN
              DELETE(B);
              ABBREVIATE;
            END
   END

When algorithm OUTPUT is called for the first time
ADDR should be nil, i.e. OUTPUT(nil) is a valid
call.

The procedure DISPLAY of algorithm OUTPUT has to
be extended, too. First of all we need in addition
to the display buffer a second buffer which we will
call address table. The address table has as many
entries as characters can be displayed on the
display. Each entry contains the address of an
expression-address. Now, the procedure DISPLAY will
update both the display buffer and the address
table: Whenever the representation of an item is
moved to the display buffer, the corresponding
entries within the address table will receive the
address of the current expression-address. Figure
12 shows the contents of the address table for the
different phases of algorithm OUTPUT for the
example given in Figure 10.

[Figure 12: address table contents for the phases of OUTPUT; table lost]

where:  1 --> nil      4 --> 1.2.nil
        2 --> 1.nil    5 --> 2.2.nil
        3 --> 2.nil

Fig. 12: Contents of the address table for Figure
10.


No entry means that at the corresponding position
of the display no expression is shown. The
algorithm ABBREVIATE will insert the address of the
expression number of the abbreviated expression
into the address table.

Now we are able to translate a cursor-address
into an expression-address: The cursor-address
denotes an offset within the address table, where
we find the address of the associated expression-
address:

CURSOR-ADDRESS  -->  ADDRESS TABLE  -->  EXPRESSION-ADDRESS

Fig. 13: Association of cursor- and expression-
address

The existence of an address table allows an
expression oriented use of some standard display-
station keys, e.g. the EOF-key (= Erase until end
Of Field) can be changed to a more useful EOE-key
(= Erase until end Of Expression). An expression is
erased by erasing all screen-positions whose
expression-addresses have the same suffix as the
current address of the expression.
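
The address table and the suffix rule can be sketched as follows. Several simplifications are assumptions of this sketch and not statements of the paper: the display is a single line, each table entry holds the expression-address itself rather than the address of one, addresses are Python tuples with the innermost number first and () standing for nil, abbreviation is not modelled, and the names output and erase_expression are hypothetical.

```python
PLACEHOLDER = "□"
IO_TABLE = {                       # excerpt of the table of Fig. 4, one-line prototypes
    "a1": "head", "a4": "B", "a5": "C",
    "k1": "apply □ to □",
    "k2": "> □ □",
}


def output(expr, address=()):
    """Return (text, address_table) with one table entry per displayed character."""
    if isinstance(expr, str):                           # atom
        text = IO_TABLE[expr]
        return text, [address] * len(text)
    name, *subs = expr
    parts = IO_TABLE[name].split(PLACEHOLDER)
    text, table = parts[0], [address] * len(parts[0])
    for i, (sub, keyword) in enumerate(zip(subs, parts[1:]), start=1):
        sub_text, sub_table = output(sub, (i,) + address)   # address i.ADDR
        text += sub_text + keyword
        table += sub_table + [address] * len(keyword)
    return text, table


def erase_expression(text, table, cursor):
    """Blank every position whose address ends with the address under the cursor."""
    target = table[cursor]

    def belongs(a):
        return len(a) >= len(target) and a[len(a) - len(target):] == target

    return "".join(" " if belongs(a) else ch for ch, a in zip(text, table))


EXPR = ("k1", "a1", ("k2", "a4", "a5"))                 # expression of Fig. 11
TEXT, TABLE = output(EXPR)
print(TEXT)
print(TABLE[6], TABLE[16], TABLE[18])
print(repr(erase_expression(TEXT, TABLE, 14)))
```

Placing the cursor on the `>` of the inner constructor (position 14) erases that constructor together with both of its subexpressions, because their addresses 1.2.nil and 2.2.nil share the suffix 2.nil.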


6. Editing: Update of Expressions

Until now we have described the passive part of the
editing system, e.g. the representation of
expressions, how they are displayed etc. Now we turn
our attention to the active part of the system
which allows the user to edit (= update, delete,
replace, etc.) expressions.


6.1. Format of the Screen Image

First of all we have to specify the screen image
used by the editor. The screen of a display should
be divided into four logical parts as shown in
Figure 14.

   EXPRESSION-field

   FA-field

   MESSAGE-field

   COMMAND-field

Fig. 14: Logical fields used by the editing system

The FA-field is the area in which the current Focus
of Attention is displayed via algorithm OUTPUT. In
the COMMAND-field the user may specify editor-
commands. The MESSAGE-field is used to display
additional information like error messages,
explanations of the commands etc.

The EXPRESSION-field is used to update
expressions. Figure 15 shows how the four fields can
be mapped onto the screen of a real display-
station. This display image is used by the editing
system of Berkling's Reduction Machine.

   E -> EXPRESSION-field        C => COMMAND-field

   MESSAGE-field

   FA-field

Fig. 15: Display image used by the editing system
of Berkling's Reduction Machine.


6.2. Editor-Commands

We have already mentioned that the Focus of
Attention, i.e. the expression which resides on top of
stack E, is displayed within the FA-field of the
screen. Now let us consider a subexpression of FA
which is given by the current cursor-position. We
will call this expression the CURSOR-expression
(CE) and denote its address by CEADDR.

All editor commands refer to CE, this means that
CE must be on top of stack E when the specified
command is going to be executed. Thus we have to
perform a SCROLLDOWN before and a SCROLLUP after
the execution of a command:

   SCROLLDOWN(REVERSE(CEADDR))
   execute specified monitor command
   SCROLLUP(CEADDR)

In this paper we will give only the description of
two basic editing commands:

   D:  Delete the CURSOR-expression
   Cx: Copy the CURSOR-expression

The D-command replaces CE by a special atom called
the EMPTY-expression which prompts the user to
enter a new expression. This ensures that a user can
never generate incomplete expressions. Whenever he
deletes an expression he has to replace it by
another expression. The algorithm for the D-command
is:

   DELETE(E); PUSH(EMPTY-EXPRESSION,E);

The copy-command copies CE either to an auxiliary
stack (x = STACK0, STACK1, etc.) or into an
expression library (x = name of an expression).
Copying is done in the following way: At first the
expression is copied to the I/O-stack B and from
there it is transported to the desired destination:

   COPY(E,B); TRANSPORT(B,X);

A list of the commands used by the reduction
machine editor is given in [HOMMES 79].


6.3. Updating Expressions

Updating expressions in an expression-oriented
editor means: replace a subexpression by another
subexpression. This is always done in the same way:

1. The user enters an expression into the
   EXPRESSION-field and positions the cursor to
   the expression in the FA-field which he wants
   to replace.

2. Via algorithm SCROLLDOWN the expression which
   is going to be replaced is brought to the top
   of stack E.

3. Via algorithm INPUT the new expression is
   generated from the old expression, the
   program library, the auxiliary stacks, and the
   input specified by the user.

4. The expression to be replaced is deleted.

5. The new expression is moved from stack B to
   stack E. A scroll-up operation is performed
   to return to the previous FA, which now
   includes the replaced subexpression.

Figure 16 shows the contents of the stacks during
the different phases of replacement:

[Figure 16: stack contents during replacement; layout lost]

Fig. 16: Contents of stacks when replacing an
expression.

Outline of procedure REPLACE:

   PROCEDURE REPLACE(ADDR: EXPRESSION-ADDRESS);
   BEGIN
     SCROLLDOWN(REVERSE(ADDR));
     INPUT;
     DELETE(E);
     TRANSPORT(B,A);
     TRANSPORT(A,E);
     SCROLLUP(ADDR);
   END

The  algorithm  INPUT  perform*  the  following  opera¬ 
tions: 

1.  The  expression  specified  by  the  user  in  the 
expression  -field  is  translated  from  its  ex¬ 
ternal  representation  to  the  associated  in¬ 
ternal  representation  by  using  the  I/O-table. 

2.  EHPTV-expressions  are  inserted  for  missing 
subexpressions,  I.e.  the  expression  entered 
by  the  user  is  automatically  completed. 

3.  References  to  other  expressions  are  resolved. 

4.  When  INPUT  terminates,  a  complete  expression 
has  been  generated  within  stack  B. 

There  are  three  references  to  other  expressions 
which  may  be  used  when  constructing  new  ex¬ 
pressions: 

express 1 on-address: 


always  denoted  by  an  axpression  having  the 
following  format: 


w  f  ai  ...  t,  or  /  I  \ 

f 

i.e.  a  special  constructor  cal lad  applicator 
followed  by  a  function  and  its  arguments. 

Progress  written  in  an  applicative  language  are 
executed  by  resolving  applications,  i.e.  by 
applying  functions  to  its  arguments,  which  is  done 
according  to  e  set  of  rewriting  rules.  A  rewriting 
rule  specifies  the  expression  by  which  an  applica¬ 
tion  is  to  be  replaced. 

Example:  The  rewriting  rule  for  the  identity  func¬ 
tion  is  given  by 

ap 

/  \  — >  e 

id  e 

An  algorithm  which  resolves  applications  can  be 
based  on  a  TRANSPORT-algorithm.  The  idea  is  to  move 
an  axpression  from  stack  E  to  stack  A,  but  to  stop 
the  transport  when  the  following  situation  occurs: 
the  applicator  is  on  top  of  stack  M,  the  function 
is  on  top  of  stack  A  and  the  arguments  are  on  top 
of  stack  E.  Then  the  application  is  resolved  ac¬ 
cording  to  the  rewrite  rule,  i.e.  the  applicator  is 
popped  ouf  of  stack  M,  the  function  is  removed  from 
stack  A,  and  the  arguments  on  top  of  stack  E  a^e 
replaced  by  the  result. 


An  expression-address  is  replaced  by  its  cor¬ 
responding  expression,  i.e.  the  expression- 
address  l.nil  may  be  uaed  to  refer  to  the  first 
subexpression  of  the  axpression  which  is  going 
to  be  replaced. 

Example:  Entering  l.nil  will  replace  an  ex¬ 
pression  by  its  first  subexpression 

name  of  an  auxiliary  stack  or  of  an  expression: 

The  name  is  replaced  by  a  copy  of  the  expression 
which  is  either  on  top  of  an  auxiliary  stack  or 
in  the  expression  library.  This  reference  is 
used  to  retrieve  expressions  which  are  moved  to 
an  auxiliary  stack  or  to  the  library  by  using 
the  copy-command. 

Note:  Expression  references  ere  resolved  by 
applying  the  basic  COPY-algorithm. 

This  chapter  hat  shown  the  basic  features  of  an 
expression-oriented  editing-system;  [HONMES  79] 
gives  more  information  and  shows  especially  how  the 
user  can  construct  programs  in  such  an  expression- 
oriented  system. 

7.  Evaluation  of  Programs 

The  editing  system  described  so  far  works  for 
arbitrary  languages  based  on  a  constructor  syntax. 
Now  we  are  going  to  restrict  this  class  of 
languages  to  applicative  languages.  These  are 
languages  in-which  an  application  of  a  function  is 


-sill-.  ’rji*. 


Stack  Stack  Stack 
E  A  M 


Stack  Stack  Stack  . 
E  A  M 


Fig.  17:  Resolyfhg  an  application 


Having resolved the application, the TRANSPORT
algorithm is activated again. When the expression
has been moved to stack A all applications have
been resolved. The algorithm TRANSPORT(A,E) moves
the expression back to stack E.

The editor can be easily extended to allow
interactive execution of expressions or
subexpressions. Introducing the editor-command E
(= Evaluate) into the environment described in
II.6.2 will result in:

SCROLLDOWN(REVERSE(CEADOR));
EVALUATE;
TRANSPORT(A,E);
SCROLLUP(CEADOR);


By using the cursor any subexpression may be
selected for evaluation.

Abstract

By using a functional programming
system as a machine language, a highly
parallel computer can be constructed.

A form of lazy evaluation, using
incomplete objects, provides a mechanism
for constructing a data flow computer
which directly executes programs written
using the functional programming system in
a highly parallel manner. Since a data
flow architecture is used, this
parallelism is not dependent on any specialized
parallel language or compiler. This
computer consists of three basic components:
a set of processors, a shared memory
containing only FP objects, and a queue
feeding functions to all processors. The
design is modular, allowing an arbitrary
number of processors, which need not be
identical.


INTRODUCTION 

A new approach to data flow computers is
suggested by functional programming (FP) systems,
as described by Backus [1]. By introducing a form
of "lazy evaluation", similar to that used by
Friedman and Wise [3], in a computer whose machine
language is an FP system, a simple yet powerful
data flow computer results.

Unlike other parallel computers, data flow
processors [2,4,5,6] obtain parallelism directly
from its source: the natural data dependencies
between operations in a program. Such computers
are not bound to parallel languages or compilers,
but are able to introduce parallelism into all
programs without need of assistance above the
hardware level.

FUNCTIONAL PROGRAMMING SYSTEMS

This section will serve as a refresher on FP
systems and as a reference for later discussion of
FP systems. Only those aspects of FP systems
relevant to computer design will be reviewed. A
complete description of the FP system used here
can be found in Backus [1].

An FP system is described by five things: a
set of objects, a set of primitive functions, a set
of functional forms, a set of definitions, and the
operation of application. Formal systems for
functional programming (FFP systems) use objects to
represent FP functions.

An object is either an atom, a sequence whose
elements are objects, or ⊥ ("bottom" or "undefined").
Atoms include numbers and identifiers. FP systems
whose sequence constructor is ⊥ preserving will
never allow ⊥ to be an element of a sequence.
Only in an FP system whose sequence constructor is
not ⊥ preserving could the sequence <X,⊥> be found.
The special atom φ is used to denote the empty
sequence, which is both an atom and a sequence.
Sequences will be represented by enclosing the
sequence elements in < and >. The application
operation is denoted by : so the application of
the function f to the object x would be written as
f:x.

All functions are applied to a single object.
Since all functions have only one argument, it is
unnecessary to give names to arguments. Because
all programs are composed only of such functions,
all variable names are completely eliminated.
Functions which would normally require more than
one argument are applied to a sequence containing
all of the needed arguments. A brief list of the
primitive functions to be used follows.

n:x            Where n is an integer. Find the
               nth element of the sequence x.
tl:x           Remove the first element of the
               sequence x.
id:x           The identity function. Return x
               unchanged.
atom:x         Tests if x is an atom. T is returned
               for true, F for false.
eq:<x,y>       Tests if x and y are equal objects.
null:x         Tests if x is φ.
reverse:x      Reverse the elements of the sequence x.
distr:<s,x>    Create a sequence of pairs formed by
               pairing each element of s with x,
               <si,x>.
distl:<x,s>    Like distr, except the pairs will
               have x for the first element,
               <x,si>.
length:x       Find the length of a sequence.
+:<x,y>        Add x and y. (-, ×, and ÷ are
               similar.)
and:<x,y>      And the booleans x and y. (Or and
               not functions are similar.)
trans:x        Transpose x, where x is a sequence
               of sequences identical in length.
apndl:<x,seq>  Append x to the left end of seq.
apndr:<seq,x>  Append x to the right end of seq.
apply:<f,x>    Apply the function f to the object x.

A ⊥ is produced whenever a function is
applied to an improperly formed object, such as
applying a selector to an atom or using a sequence
in place of a number for an arithmetic operation.
All functions are ⊥ preserving, returning ⊥ when
applied to ⊥. (But see the discussion later of
non ⊥ preserving functions.)

Functional forms are functions which use
other functions or objects as parameters. Forms
are used to create expressions involving functions.
The functional forms to be used are:

f∘g:x             Compose f and g. Returns f:(g:x).
[f1,...,fn]:x     Construct a sequence whose ith
                  element is fi:x.
(p→f;g):x         If p:x is T return f:x, otherwise
                  if p:x is F then return g:x.
ȳ:x               Return y, a constant.
/f:x              Insert a binary function into
                  a sequence.
                  /f:<x1> = x1; /f:<x1,...,xn>
                  = f:<x1,/f:<x2,...,xn>>.
αf:<x1,...,xn>    Apply a function to all elements
                  of a sequence.
(while p f):x     While p:x = T apply f to x.
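
The primitive functions and functional forms listed above can be modelled as ordinary higher-order functions. The following Python sketch is only an illustration and is not part of the original design: FP sequences are represented by tuples, ⊥ by a sentinel called BOTTOM, and all of the names (seq, strict, select, construct, and so on) are hypothetical. The sequence constructor is assumed to be ⊥ preserving here.

```python
from functools import reduce

BOTTOM = object()  # stand-in for the undefined object; an assumption of this sketch


def seq(items):
    """Bottom-preserving sequence constructor: any BOTTOM element spoils the sequence."""
    items = tuple(items)
    return BOTTOM if any(v is BOTTOM for v in items) else items


def strict(f):
    """Make f bottom-preserving: f applied to BOTTOM is BOTTOM."""
    return lambda x: BOTTOM if x is BOTTOM else f(x)


def select(n):
    """The selector n: the nth (1-based) element of a sequence."""
    return strict(lambda x: x[n - 1] if isinstance(x, tuple) and 1 <= n <= len(x) else BOTTOM)


tl = strict(lambda x: x[1:] if isinstance(x, tuple) and x else BOTTOM)


def _add(x):
    ok = isinstance(x, tuple) and len(x) == 2 and all(type(v) in (int, float) for v in x)
    return x[0] + x[1] if ok else BOTTOM


add = strict(_add)


def compose(f, g):            # composition of f and g
    return lambda x: f(g(x))


def construct(*fs):           # construction [f1,...,fn]
    return strict(lambda x: seq(f(x) for f in fs))


def cond(p, f, g):            # conditional: f if p holds, g if it does not
    def run(x):
        test = p(x)
        if test is True:
            return f(x)
        if test is False:
            return g(x)
        return BOTTOM
    return run


def constant(y):              # the constant function for y
    return strict(lambda x: y)


def insert(f):                # insert, folding from the right as in the definition
    def run(x):
        if not isinstance(x, tuple) or not x:
            return BOTTOM
        return reduce(lambda acc, v: f((v, acc)), reversed(x[:-1]), x[-1])
    return strict(run)


def apply_to_all(f):          # apply-to-all
    return strict(lambda x: seq(f(v) for v in x) if isinstance(x, tuple) else BOTTOM)
```

For instance, insert(add) applied to (1, 2, 3, 4) adds the four numbers, and construct(select(1), select(3)) applied to a two-element sequence yields BOTTOM because the third element does not exist.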

The state D contains all functions defined by
the user. A function definition associates a
name (an atom) with a function. Definitions are
denoted by def name = function. All function
names must be either defined in D or known by
the system as primitive functions or forms. Since
D never changes during the execution of a program,
the set of functions defined for a particular
program is static.

THE INHERENT PARALLELISM IN
AN FP SYSTEM

The FP forms which directly imply
parallelism are: apply-to-all (α), insert (/), and
construction ([...]). Apply-to-all creates a
sequence by applying the same function to a
variety of objects, while construction creates a
sequence by applying a variety of functions to the
same object. Within these forms function
evaluations can proceed in parallel, due to the
absence of side effects.


The insert form computes a single result by
absorbing each element of a sequence into a dyadic
operator. If the operation being inserted is
associative (i.e., if f:<A,f:<B,C>> = f:<f:<A,B>,C>
for all objects A, B and C) then this form can
be highly parallel. An associative insert can
"tree in" the sequence rather than proceeding
serially through the sequence. Associative
functions would be recognized before program execution
and two different insert forms would be used:
insert and insert-associative.

An interesting property of these forms is
that if a parallel construction form is
implemented, then parallel versions of the
insert-associative and apply-to-all forms can be expressed
using parallel construction. Assuming that the function
is being applied to the pair <function, object>
(the result of [2∘1, 2] in an FFP system), then
suitable definitions for the apply-to-all and
insert-associative functions are:

def APPLYTOALL = null∘2 → φ̄;
        apndl∘[apply∘[1, 1∘2],
               APPLYTOALL∘[1, tl∘2]]

def INSERTASSOC = eq∘[length∘2, 1̄] → 1∘2;
        INSERTASSOC∘REDUCEPAIRS

def REDUCEPAIRS = leq∘[length∘2, 1̄] → id;
        [1, apndl∘[apply∘[1, [1∘2, 2∘2]],
                   2∘REDUCEPAIRS∘[1, tl∘tl∘2]]]

The function name leq is used for a
less-than-or-equal-to function. For the apply-to-all
function, if both the apply and APPLYTOALL
arguments to the apndl function are evaluated in
parallel, then eventually each application will
be running in parallel. In the case of INSERTASSOC,
the function REDUCEPAIRS will apply the
function being inserted to successive pairs in the
sequence, halving the length of the sequence.
This will be done in parallel, as with APPLYTOALL.
The INSERTASSOC function iteratively calls
REDUCEPAIRS, which trees in the sequence one level,
until the final result (the top of the tree) is
reached.
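
The recursive structure of these three definitions can be followed in a short Python sketch. It is a sequential model under stated assumptions: the pair <function, object> is split into two Python arguments, sequences are tuples, and the names apply_to_all, reduce_pairs and insert_assoc are hypothetical. The applications inside one call of reduce_pairs are independent of each other, which is where the machine described here could evaluate them in parallel.

```python
def apply_to_all(f, seq):
    """APPLYTOALL: the empty sequence maps to itself; otherwise the result of
    f on the first element is appended to the left of the rest."""
    if not seq:
        return ()
    return (f(seq[0]),) + apply_to_all(f, seq[1:])


def reduce_pairs(f, seq):
    """REDUCEPAIRS: combine successive pairs, roughly halving the length.
    A trailing unpaired element is passed through unchanged."""
    if len(seq) <= 1:
        return seq
    return (f(seq[0], seq[1]),) + reduce_pairs(f, seq[2:])


def insert_assoc(f, seq):
    """INSERTASSOC: tree in the sequence one level at a time."""
    if not seq:
        raise ValueError("insert of an empty sequence is undefined in this sketch")
    if len(seq) == 1:
        return seq[0]
    return insert_assoc(f, reduce_pairs(f, seq))
```

Because each level combines neighbours in order, a sequence of n elements is reduced in about log2(n) levels, and the result agrees with the serial insert whenever f is associative.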

PARALLELISM  IN  COMPOSITION 

Introducing parallelism into the composition
form is more difficult. The nature of composition
would seem to prohibit any sort of parallelism
due to the inherent data dependency between the
functions being composed. If it is required that
the data transferred between the functions is an
object in the usual sense, then parallelism is in
fact impossible. If, however, a function is able
to form partial results, then these results can be
passed between the functions allowing some degree
of overlap. These partial results arise from the
ability to decompose (or factor) many functions.

To express partial results incomplete objects
will be introduced. An incomplete object is an
object containing portions which have yet to be
determined, but which eventually will be filled
in. The FP system requires only one new "object"
to express these incomplete objects, the incomplete
atom ω. ω will serve as the fundamental unit of
incompleteness, capable of assuming any value on
completion. An ω can be thought of as a
placeholder, representing the result of an arbitrary
function which has not yet been finished.

Every ω will be associated with a completion
function. This completion function will
eventually specify a value to be used in place of the ω.
Formally, any ω should be identified by its
completion function. A more casual notation, in
which ω's with different completion functions
will be given different subscripts, will be used
herein. Of course, there may be many references
to the result of a single completion function.

When ω is used as a sequence in an append
function, a new sort of incomplete object is
created. If apndl:<X,ω1> is evaluated, the
result will be denoted by <X,Ω1>. Ω is called
the incomplete subsequence, and is used to
indicate a section of a sequence, of arbitrary length,
which has not yet been filled in. In this example,
ω1 and Ω1 have the same completion function, yet
the result of the completion function will be
installed within a sequence in the case of Ω1. For
example, if ω1 (and Ω1) complete to <Y,Z>, the
sequence will now be <X,Y,Z>, not <X,<Y,Z>>.

All Ω's will be found within a sequence. Any
time that an Ω completes to a non-sequence, an
error (⊥) will result. An Ω is not a separate
incomplete atom, but rather a different usage of
the basic incomplete atom ω. Any Ω will be
dependent on some ω for its completion function.
If ω appears within a sequence, it represents a
particular element of the sequence whose value is
as yet unknown, but if Ω appears in a sequence, it
represents a portion of the sequence itself which
is unknown. Any sequence containing Ω will be
termed an incomplete sequence; any object
containing either ω or Ω will be termed an incomplete
object.

Conceptually, an incomplete object is a set
of objects. This set contains all possible values
the incomplete object may assume on completion.
For example, ω would be the set of all objects,
<Ω> would be the set of all sequences (including
φ), <ω1,ω2> would be the set of all sequences
of length 2, and so on. A partial ordering of
incomplete objects can be constructed using the
containment relation between their associated
sets. An incomplete object, X, is more complete
than another incomplete object, Y, if the set of
objects associated with X is a proper subset of
the set associated with Y. A complete object is
one whose set contains only one member, the object
itself.

When a function is applied to an incomplete
object, four different situations may arise:

1. The object is not sufficiently complete
for the function to have any effect. In this
case, the function must be deferred until the
object becomes more complete.

2. The function can be applied to portions
of the object, but must defer applying itself to
other sections of the object.

3. The function can be applied to the object,
but the result is still incomplete.

4. The function can be applied to the object
and the result is a complete object.

A few illustrations of these cases are:

1. +:<3,ω1> cannot be evaluated (at this
instant).

2. reverse:<A,B,Ω1,D,E> = <E,D,(reverse:<Ω1>),B,A>
= <E,D,Ω2,B,A>, where a new Ω2 has
been created to hold the result of (reverse:<Ω1>).

3. 3:<A,B,ω1> = ω1. trans:<<ω1,ω2>,<ω3,ω4>>
= <<ω1,ω3>,<ω2,ω4>>.

4. 3:<ω1,B,C> = C. length:<ω1,ω2> = 2.
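
The four cases can be made concrete with a small Python model. This is a sketch under assumptions rather than the paper's mechanism: incomplete atoms and incomplete subsequences are represented by two hypothetical classes, IncompleteAtom and IncompleteSub, sequences are tuples, and a function that must wait returns a sentinel called DEFER instead of attaching itself to a queue.

```python
DEFER = object()  # returned when the function must wait for a more complete object


class IncompleteAtom:
    """A placeholder for a single value that is not yet known."""


class IncompleteSub:
    """A placeholder for an unknown stretch of a sequence."""

    def __init__(self, source=None):
        self.source = source  # the placeholder this one is derived from, if any


def add(pair):
    x, y = pair
    if isinstance(x, IncompleteAtom) or isinstance(y, IncompleteAtom):
        return DEFER                      # case 1: nothing can be done yet
    return x + y


def reverse(s):
    # case 2: known elements are reversed now; each unknown stretch becomes a
    # fresh placeholder standing for the reversal of the original stretch
    return tuple(IncompleteSub(source=e) if isinstance(e, IncompleteSub) else e
                 for e in reversed(s))


def select(n, s):
    prefix = s[:n]
    if any(isinstance(e, IncompleteSub) for e in prefix):
        return DEFER                      # position n is not fixed yet
    if len(prefix) < n:
        raise ValueError("sequence too short: the result is bottom")
    return prefix[-1]                     # case 3 if incomplete, case 4 otherwise


def length(s):
    if any(isinstance(e, IncompleteSub) for e in s):
        return DEFER
    return len(s)                         # an incomplete atom still counts as one element
```

Note that select only inspects the first n elements, so an unknown stretch later in the sequence does not delay it, while length must wait for every unknown stretch to be filled in.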

A rather subtle problem has arisen here. By
postponing the completion of a sequence, the ⊥
preserving nature of the sequence constructor has
been lost. For example, if 1:<A,ω1> is evaluated
to A, this result becomes incorrect if ω1 is
completed by ⊥ and the sequence constructor is ⊥
preserving. Thus, it is natural for an FP system
which uses incomplete objects to have a sequence
constructor which is not ⊥ preserving, preventing
entire sequences from being later replaced by ⊥.
(To further allow parallelism, it would be
possible to produce other functions which are not
⊥ preserving. An example of such a function would
be the and function. If and is defined so that
its result is F (false) if either element of the
pair it is applied to is F, then and:<F,ω1> could
be immediately evaluated to F.)
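
The difference between the two versions of and can be shown in a few lines of Python. This is only a sketch: PENDING is a hypothetical stand-in for an incomplete atom, and the function names are invented for the illustration.

```python
PENDING = object()  # stand-in for an incomplete atom; an assumption of this sketch


def and_strict(x, y):
    """Waits until both elements of the pair are known."""
    if x is PENDING or y is PENDING:
        return PENDING
    return x and y


def and_early(x, y):
    """Answers F as soon as either element is known to be F."""
    if x is False or y is False:
        return False
    if x is PENDING or y is PENDING:
        return PENDING
    return True
```

With and_early, the function computing the pending element becomes unnecessary as soon as the other element is F.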

Incomplete objects are closely related to
the suspensions produced in lazy evaluation [3].
One difference is that incomplete objects imply
concurrent function evaluation while suspensions
imply delayed function evaluation. Another is
that conceptually, incomplete objects stay within
the realm of objects (with only ω added), while
suspensions are used transparently. The real
advantage in using incomplete objects rather than
suspensions lies in the clean notation of
incomplete objects and the ability to stay within the
set of objects.

THE  DESIGN  OF  AN  FP  COMPUTER 

The design goals of the FP computer will be:

1. The computer will use an FP system as a
machine language.

2. The memory will be used only for FP
objects.

3. The computer will be data-driven;
parallelism will result naturally from data
dependencies.

4. The computer will be modular, allowing
great expansion without any change in the basic
architecture.

Goal 1 provides a computer which will enforce
a disciplined use of the memory at the hardware
level, preventing destructive updating and side
effects. Goal 2 allows the memory to be
homogeneous. Since only objects are being stored, the
memory is not forced into the conventional word
and address structure. Goal 3 attempts to produce
an ideal data flow computer by putting the burden
of parallelism onto the hardware. Goal 4 states
that the design should be expandable, allowing
great increases in computing power without
changing the underlying architecture.

Incomplete objects will be used to produce the
necessary parallelism. Two basic principles will
govern the use of incomplete objects. First, all
functions will be completion functions. This
associates each function with a place (an
incomplete atom) for its result. The second
principle is that incomplete atoms will be generated
by the function apply. This includes the use of
apply in most functional forms. For example,
f∘g:x would be treated as f:(g:x), so that two
incomplete atoms would be used, one for the result
of g:x and the other for the result of f:(g:x).

The FP computer will have three basic
components: a set of processors, a memory, and a
READY queue. The processors apply functions to
objects, the memory holds these objects, and the
READY queue feeds functions to the processors.

The READY queue functions as a "shared
program counter". All functions evaluated by the
processors must flow through the READY queue.
Whenever a function is ready to be executed, it is
placed into the READY queue. A queue element
(instruction) has four components. The format of
a queue element is:

<function, object, ω_result, D>

The function and object describe an application
to be performed, ω_result indicates the atom being
completed, and D is the state of the program. D
will be constant for all queue elements of a
single program. In a multiple program environment,
different programs could be distinguished by their
different D's.

The memory contains only objects. Objects
include queue elements, D's, functions, and
incomplete atoms. The memory must be managed, allowing
new objects to be created and removing objects
which have become garbage. When an incomplete
atom is identified as garbage, its completion
function must be terminated. Since it is
important to remove these garbage functions as soon as
possible, garbage should be identified immediately
when produced.

All incomplete atoms will have an attached
queue, similar to the READY queue. These queues
will contain functions which are blocked by an
input which is not sufficiently complete.
Whenever a function cannot evaluate, it attaches
itself to an incomplete atom blocking it. When an
incomplete atom is completed (actually, it still
can be replaced by an incomplete object, but it
will always become more complete), its queue is
attached to the READY queue.

The processors take queue elements from the
READY queue and execute them. Figure 1 gives a
simplified flowchart of processor operation. Three
distinct paths exist through this flowchart: one
for garbage functions, one for functions blocked
by incomplete objects, and one for functions which
are executed. Processors have three sorts of
functions to deal with: built in functions,
defined functions, and forms. Built in functions
have some standard representation recognized by
the processors; defined functions are fetched
from the state, D; and forms are handled through
the metacomposition rule. All inter-processor
communication is handled by the READY queue and
memory. No special inter-processor communication
hardware is required. Also, no processor has
any state saved between instructions.
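
The interplay of the READY queue, the incomplete atoms and their attached queues can be simulated in a few lines of Python. The sketch below is a single-processor model under simplifying assumptions: a queue element is a tuple (function, object, result atom) with the state D omitted, every function needs all of its inputs complete, an atom is completed exactly once, and the garbage path is left out. The class Atom and the function run are hypothetical names.

```python
from collections import deque
import operator


class Atom:
    """An incomplete atom: a result slot plus a queue of blocked queue elements."""

    def __init__(self):
        self.done = False
        self.value = None
        self.blocked = deque()


def run(ready):
    """Drain the READY queue; return how many queue elements were taken from it."""
    steps = 0
    while ready:
        function, obj, result = ready.popleft()
        steps += 1
        waiting_on = next((a for a in obj if isinstance(a, Atom) and not a.done), None)
        if waiting_on is not None:
            # blocked path: attach the element to the atom that is blocking it
            waiting_on.blocked.append((function, obj, result))
            continue
        # executed path: apply the function and complete the result atom
        args = [a.value if isinstance(a, Atom) else a for a in obj]
        result.value = function(*args)
        result.done = True
        # the queue of the completed atom is moved to the READY queue
        ready.extend(result.blocked)
        result.blocked.clear()
    return steps


# (1 + 2) * (3 + 4), with the multiplication placed in the queue first
t1, t2, out = Atom(), Atom(), Atom()
ready = deque([
    (operator.mul, (t1, t2), out),
    (operator.add, (1, 2), t1),
    (operator.add, (3, 4), t2),
])
steps = run(ready)
```

In this run the multiplication is taken from the READY queue twice: once when it blocks on its first input and once when it finally executes.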


MULTIPLE  PROCESSOR  TYPES 

The architecture can be expanded to
accommodate different types of processors. The only
addition needed is a READY queue for each processor
type. When a queue element is ready for execution,
it is placed into the READY queue corresponding
to the function within the queue element. This
allows a system to use a smaller number of
processors for functions which are costly to implement
or infrequently used. Also, a high speed
arithmetic processor would not be tied up executing
non-arithmetic functions.

One very useful processor type would be a
processor which only checks for executable
functions (functions whose object is sufficiently
complete to allow execution of the function). This
very simple processor would remove this burden
from processors with computing abilities.
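
Routing queue elements to one READY queue per processor type could look like the following Python sketch. The split into "arithmetic" and "general" processors, the set ARITHMETIC and all function names are assumptions made for the illustration.

```python
from collections import deque

ARITHMETIC = {"+", "-", "*", "/"}          # assumed split of the function set

ready_queues = {"arithmetic": deque(), "general": deque()}


def processor_type(function_name):
    """Choose the processor type from the function inside the queue element."""
    return "arithmetic" if function_name in ARITHMETIC else "general"


def make_ready(element):
    """Place a queue element into the READY queue of the matching processor type."""
    function_name = element[0]
    ready_queues[processor_type(function_name)].append(element)


def next_element(kind):
    """Called by a processor of the given type; None means its queue is empty."""
    queue = ready_queues[kind]
    return queue.popleft() if queue else None
```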

COMPARISON  WITH  OTHER  DATA  FLOW  COMPUTERS 

A broad definition of a data flow processor
[6] is one in which the execution sequence is
controlled by data dependencies. Many data flow
computers require that a model for the partial
ordering of the execution sequence be constructed
before execution, at a time when data dependencies
cannot be completely located. The FP computer,
however, needs no such model since data
dependencies are manifested during program execution.
Furthermore, the FP computer allows specialized
processors and program control is not directed from
a single master processor.

The use of an FP system for a machine language
induces single assignment behavior [5], which is
also found in pure LISP [3,4]. FP systems provide
a more practical machine language than LISP [3,4]
since FP systems do not use a changing environment
or variable names. Selectors are much more
suitable for accessing values at the machine level than
names.

The placing of a queue element into the READY
queue corresponds to firing [2] but the FP computer
does not know if the queue element is actually
ready for execution. An FP operation may "fire"
several times, each time waiting for a more
complete input, until the operation is finally
performed.

The overhead involved with parallelism lies
in encountering functions which are found to be
unexecutable due to an insufficiently complete
object. This overhead is usually limited for a
particular function, since only a limited number
of stages of completion are possible for objects.
For example, the + function normally will see a
maximum of only 3 stages of completion of its
argument, such as <ω1,ω2>, <ω1,n2>, <n1,n2>.

IMPLEMENTATION  OF  THE  FP  COMPUTER 


This  section  outlines  those  features  of  the  FP 
computer  which  relate  to  parallel  processing. 

The  Functional  Forms 

Composition: Composition uses an incomplete
atom to link the functions being composed. When
<f∘g, x, ω_result, D> is executed, a new incomplete
atom, ω_temp, is created. The function g is started
by placing <g, x, ω_temp, D> in the READY queue. The
function f is placed in the queue attached to
ω_temp, in the form of the queue element
<f, ω_temp, ω_result, D>. As soon as g produces its
first partial result, f will attempt to proceed.

Construction: Construction forms a sequence
of incomplete atoms. When <[f1,...,fn], x, ω_result, D>
is executed, a result, <ω1,...,ωn>,
is immediately formed. Also, for each fi the
queue element <fi, x, ωi, D> is added to the
READY queue.

Apply-to-all: The only difference between
construction and apply-to-all is that apply-to-all
may be applied to an incomplete sequence. If
<αf, <x1,...,Ωi,...,xn>, ω_result, D> is executed,
the result will be <ω1,...,Ωi,...,ωn>. The
queue element applying αf to the incomplete
subsequence will be attached to the queue of ωi.
Otherwise, all other functions
will be attached to the READY queue as with the
construction form.

Insert-associative: When applied to a
complete sequence, insert-associative can be
implemented in terms of other forms. When applied to
an incomplete sequence, this form is similar to
apply-to-all.

Conditional: There are two ways to implement
the conditional form, (p→f;g): parallel and
non-parallel. Both would have the same semantics, but
a parallel conditional would evaluate p, f, and
g in parallel. This is not always desirable,
since considerable processing might be wasted
evaluating the alternative which will not be
chosen. This is a real problem in loops closed by a
conditional, since a parallel conditional form would
look ahead beyond the end of the loop. Other
times, however, parallel evaluation of p, f, and
g will speed up execution.

For the non-parallel conditional, evaluating
<(p→f;g), x, ω_result, D> will create a new functional
form, choose. <(choose f g x), ω_temp, ω_result, D>
will be placed on the queue of ω_temp and
<p, x, ω_temp, D> will be placed on the READY queue. Once
p returns a value, the choose form will be
activated, which will select either f:x or g:x as a
result.

The parallel conditional can be expressed in
terms of construction and a new primitive function,
cond. A parallel (p→f;g) would be expressed by
cond∘[p,f,g], where cond behaves like (1→2;3).
The parallelism results from the parallel function
evaluation used by construction. When p returns
a value, the unused function, f or g, will become
garbage and terminate.

Primitive Functions: Different primitive
functions require various degrees of completeness
before being executed. A few examples are:

+ requires a complete object.
length requires a complete sequence.
3 requires a sequence whose first 3 elements
are not Ω's.
id permits any incomplete object.

The only other aspect of primitive functions
related to parallelism is the ability of some
functions to decompose themselves when applied to
incomplete sequences (see "reverse").

Processor Synchronization

Only two operations require synchronization
of the processors. First, requests for new
objects must be synchronized. This can be
accomplished by various techniques, depending on the
exact memory organization. The simplest would use
a conventional free list protected from multiple
accesses with a semaphore. An "intelligent" memory
might be able to handle multiple memory requests
internally.

The other need for synchronization lies in
the only object which can be updated: the
incomplete atom. The time between finding an
incomplete atom and attaching an element to its queue
must be protected from completion of the atom.
This could be accomplished with a semaphore on
each incomplete atom. Since these queues are
not as active as the READY queue and the time
duration between finding an incomplete atom and
using its queue is short, little time would be
lost on processor synchronization.
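
The second synchronization point can be sketched with one lock per incomplete atom. The Python below is an illustration under assumptions, not the paper's hardware mechanism: threading.Lock plays the role of the semaphore, an atom is completed exactly once (the text also allows replacement by a more complete incomplete object), and the class and method names are hypothetical.

```python
import threading
from collections import deque


class IncompleteAtom:
    def __init__(self):
        self._lock = threading.Lock()   # plays the role of the per-atom semaphore
        self._done = False
        self._value = None
        self._blocked = deque()

    def attach_or_read(self, element):
        """Atomically either queue the element on this atom or read its value.

        Returns (False, None) if the element was attached, (True, value) if the
        atom had already been completed.
        """
        with self._lock:
            if self._done:
                return True, self._value
            self._blocked.append(element)
            return False, None

    def complete(self, value):
        """Atomically install the value; return the elements to move to READY."""
        with self._lock:
            if self._done:
                raise RuntimeError("atom already completed")
            self._done = True
            self._value = value
            released = list(self._blocked)
            self._blocked.clear()
            return released
```

Because the check and the attachment happen under the same lock as the completion, an element can never be attached to an atom whose queue has already been handed to the READY queue.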

The  READY  Queue 

The READY queue must be an extremely fast
queue, since all functions must pass through it.
As long as all instructions put into the READY
queue are eventually given to processors, it is
not important to force specific queue behavior on
the READY queue. Also, it is not necessary to
have multiple READY queues for different
processors if processors pull only the type of
functions they need from a single READY queue, although
this could involve unnecessary waiting for the
proper function type.

A  PROGRAMMING  EXAMPLE 

A characteristic example of the parallelism
introduced by the FP computer is found in a
sorting program. A merge-sort program written in
an FP system might be:

def SORT = (/MERGE)∘(α[id])
def MERGE = null∘1 → 2; null∘2 → 1;
        GREATER∘[1∘1, 1∘2] → apndl∘[1∘2, MERGE∘[1, tl∘2]];
        apndl∘[1∘1, MERGE∘[tl∘1, 2]]

Since MERGE is associative, /MERGE can be
implemented with an insert-associative form. One
kind of parallelism will result from the use of
the insert-associative: the MERGE functions will
be arranged in a tree and all merges in a level of
the tree will execute in parallel. Another kind
of parallelism arises when the MERGE operations
produce partial results through the use of
incomplete sequences. Each time a MERGE produces
an element of its result, this element is
immediately fed into the next higher MERGE. A diagram
of the data flow is given in Figure 2.
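
The shape of this computation can be imitated in Python with generators. The sketch is an analogy under assumptions, not the FP computer itself: generators are demand-driven, which is closer to the suspensions of lazy evaluation than to concurrently completed incomplete sequences, and the names DONE, merge and sort are hypothetical. It does show the two properties discussed above: the merges are arranged in a tree, and each merge hands on an element as soon as that element is known.

```python
DONE = object()  # marks an exhausted input; an assumption of this sketch


def merge(a, b):
    """Merge two sorted inputs, yielding each element as soon as it is decided."""
    a, b = iter(a), iter(b)
    x, y = next(a, DONE), next(b, DONE)
    while x is not DONE and y is not DONE:
        if x > y:
            yield y
            y = next(b, DONE)
        else:
            yield x
            x = next(a, DONE)
    while x is not DONE:
        yield x
        x = next(a, DONE)
    while y is not DONE:
        yield y
        y = next(b, DONE)


def sort(values):
    # wrap every element in a one-element sequence
    level = [[v] for v in values]
    if not level:
        return []
    # tree in: merge neighbouring sequences level by level
    while len(level) > 1:
        level = [merge(level[i], level[i + 1]) if i + 1 < len(level) else level[i]
                 for i in range(0, len(level), 2)]
    return list(level[0])
```

Asking the top merge for its first element pulls exactly one decision through every level of the tree, which mirrors the way a partial result of a lower MERGE is fed into the next higher one.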

This parallelism was achieved completely by
the computer; no explicit parallelism was embedded
in the program. This example should serve as an
indication of the amount of parallelism which
would naturally occur when a program is run on an
FP computer.


CONCLUSIONS 

Functional programming systems provide a
basis for a computer architecture which introduces
parallelism at the most basic level: the machine
language. Through the use of incomplete objects,
a completely data-driven computer has been
designed. Parallelism has been achieved without
complex synchronization mechanisms or complex
inter-processor communication networks.
Furthermore, the computer could accommodate very large
numbers of processors for the introduction of a
very high degree of parallelism.

This  computer  haa  the  additional  benefit  of  a 
structured  machine  language  with  simple  and 
clean  semantics.  No  instructions  are  provided 
for  the  Introduction  of  parallelise;  this  comes 
automatically.  Thua,  all  programs  run  on  this 
compul.ji.'  taka  advantage  of  available  parallelism 
without  the  aid  of  special  parallel  languages  or 
compilers.  Parallelism  does  not  change  the  seman¬ 
tics  of  a  program,  allowing  tha  programs  to  be 
analysed  without  regard  to  parallel  behavior. 
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ABSTRACT  -  Wa  claim  that  the  principal 
limitation  in  tha  performance  of  currant 
document  preparation  programs  lies  in  the 
inability  of  the  underlying  architecture 
to  efficiently  execute  the  most  frequently 
performed  operation  —  the  movement  of 
data and its reorganization in the
computation of line images of the output. We
present the design of a unit intended to
expedite this data rearrangement, in the
context of a macro-architecture, and show
how this unit can be generalized to a
variety of other processing tasks.


1.  INTRODUCTION 

Architectures are described which utilize
VLSI technology to directly address the
problems of document formatting, to which
computers are being applied with increasing
frequency in the rapidly evolving
field of office automation.

Both  a  macro-architecture  and  a  micro- 
architecture  are  described.  The  macro 
architecture  presents  a  framework  within 
which  to  develop  all  of  the  functions 
associated  with  document  processing.  The 
micro-architecture  is  a  specification  of 
the  design  of  a  particular  aspect  of  docu¬ 
ment  processing. 

Our particular micro-architecture
addresses the area of text formatting.
The major component of this architecture
is the Fill Line Unit (FLU), which performs
a function in DPM's analogous to that
performed by the ALU's in conventional
machines. It provides a first example of
the realization in hardware of the many
functions associated with text processing.


2.  MOTIVATION 

Computer technology is generally described
as having progressed through several
stages of evolution, usually referred to
as generations:

o First generation (1950-1957) - vacuum
  tubes and miscellaneous main memories

o Second generation (1958-1964) -
  transistors and random access magnetic
  core memories

o Third generation (1965-1975) - small
  scale integrated circuits and random
  access magnetic core or solid state
  memories

o Fourth generation (1975-present) -
  medium scale integration and solid
  state random access memories


During  the  same  period  of  time  there  has 
been  a  steady  shift  from  primarily  arith¬ 
metic  and  control  computation  to  the  mix¬ 
ture  of  arithmetic  and  symbolic  computa¬ 
tion  typified  by  document  preparation  and 
the  so-called  "office  automation." 


The changes in technology have been
reflected in the architecture of the
processing units. The introduction of a bus
structure was eventuated by the availability
of large numbers of registers in the
processing unit with the transition to
third generation systems. The introduction
of cache memories came with the availability
of solid-state memories. However, the
architecture of computers has not been
dramatically affected by the changes in the
typical application mix.


In [7] Mukhopadhyay surveys architectural
considerations for non-numeric processing
and points out that, "With the proliferation
of computers in all spheres of human
civilization, most of what will be
expected of future computers will be
non-numerical... Existing computer architecture
does not provide efficient non-numeric
computation."

Much of the research in non-numeric
processing of late has centered on searching,
sorting and pattern matching hardware for
database machines [8]. Architectures for
document preparation, and in particular
text formatting systems, need not use
special hardware for searching, sorting or
pattern matching.

We believe that the major improvement in
computer architecture required by document
preparation systems is the rapid and
efficient rearrangement of data in memory,
with relatively minimal associated
processing. The proof of such an assertion is
likely to be quite difficult, but the data
in Table 1 show the effect of 'line
filling' alone on an admittedly simple
document processor, roff [11].


# of          processing time     processing time
lines         w. line filling     w/o line filling
              (cpu seconds)       (cpu seconds)

[illegible]   [illegible]         [illegible]
10[?]         2.3                 1.5
544           9.4                 5.3
876           11.4                6.5

Table 1. Comparison of Processing Time
W and W/O 'Line Filling'


In Table 1, the same documents were run
through the document processor twice, once
with 'line filling', in which case lines
are right and left justified, and once
without 'line filling', in which case the
text is printed without rearrangement. In
line filling, the text is arranged so
that each line includes the maximum number
of words, and if these do not
quite fill the line, then the words are
spaced out by inserting added blanks between
words. Although this incremental processing
requires relatively little computation,
it is very intensive in data movement.
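
The line filling operation just described can be sketched in
software as follows. This is only an illustration under stated
assumptions, not the roff implementation: the function names
(fill_lines, justify, format_text) are hypothetical, words are
separated by single blanks before padding, leftover blanks go to
the leftmost gaps, and the final line is left ragged.

```python
def fill_lines(words, width):
    """Greedily pack words so each line holds as many words as fit."""
    lines, current = [], []
    for word in words:
        # Width of the line if `word` were appended with single blanks.
        needed = sum(len(w) for w in current) + len(current) + len(word)
        if current and needed > width:
            lines.append(current)
            current = []
        current.append(word)
    if current:
        lines.append(current)
    return lines


def justify(words, width):
    """Insert added blanks between words so the line is `width` wide."""
    if len(words) == 1:
        return words[0]
    gaps = len(words) - 1
    extra = width - sum(len(w) for w in words) - gaps
    base, remainder = divmod(extra, gaps)
    pieces = []
    for i, word in enumerate(words[:-1]):
        # One blank per gap, plus an even share of the extra blanks;
        # the leftmost gaps absorb the remainder (an assumption).
        pieces.append(word + " " * (1 + base + (1 if i < remainder else 0)))
    pieces.append(words[-1])
    return "".join(pieces)


def format_text(text, width):
    """Fill and justify `text`; the last line is not padded."""
    lines = fill_lines(text.split(), width)
    if not lines:
        return []
    return [justify(ws, width) for ws in lines[:-1]] + [" ".join(lines[-1])]
```

Note that almost all of the work is copying characters to new
positions; the arithmetic is limited to a few additions and one
division per line, which is the imbalance the text points to.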

In this paper we describe a document
preparation component, the Fill Line Unit
(FLU), which can be used to enhance the
capability of machines used heavily for
this type of non-numeric computation. The
augmentation of architectures by means of
such add-on units has many precedents in
the evolving architecture of computers:
extended arithmetic capability, I/O
channels, cache, memory mapping, and direct
memory access are such enhancements which
have been introduced as the technology
became appropriate.


3. MACRO-ARCHITECTURE

We now describe a macro-architecture (see
Figure 1) as a framework for explicating
the concept of the FLU. In this
architecture user text is kept in a Line Memory
(LM), a buffer's worth of lines for each
active user. The state of a formatting
process is kept at any time in a register
bank indexed by user. Among the registers
are the file descriptor register (FDR),
the line address register (LAR) which
points to the next line in a user's
buffer, a memory data register (MDR) which
contains a line fetched from line memory
or obtained from an I/O device, and the line
count register (LCR) which contains the
number of lines left to process in a
user's buffer.

Typically, a user's process index is
placed in the Bank Select Register, causing
the user's process registers to be
selected. The LAR is used to address the
next line in the Line Memory to be
processed. This line is accessed and
concatenated with the present contents of the
MDR, the FLU is activated, and the
result is placed back in the MDR. If all the
lines of a given user's buffer area have
been processed, then a new buffer's worth
is brought into the LM.


4. DESIGN AND IMPLEMENTATION OF THE
MACRO-ARCHITECTURE

So  far  we  have  specified  a  framework 
within  which  a  FLU  could  be  utilized.  Now 
we  specify  the  details  associated  with  a 
text  formatting  application. 

The registers in the register bank have
thus far been only partially specified.
The text formatter registers in the
register bank consist of the left margin
register (LMR) which contains the position of
the left margin on a line of output text,
the right margin register (RMR) which
contains the position of the right margin on
a line of output text, the page number
register (PNR) which contains the number
of the current page, the line space
register (LSR) which contains the number of
spaces between output lines, the page
length register (PLR) which contains the
number of lines in an output page, the
header register (HR) which contains the
header line to be placed on each output
page, the footer register (FR) which
contains the footer line to be placed on each
output page, the piece register (PR) which
contains the piece of the MDR that is left
after the leftmost portion of the MDR is
output, the header bit (H) which is set if
a header is to be output, the footer bit
(F) which is set if a footer is to be
output, and the fill bit (FL) which is set if
the line filling operation associated with
the MDR is to be activated.

The text formatter accepts text to be
formatted along with commands describing the
output format of the text. Ideally, we
envision a command language that resembles
the language used to edit manuscripts. To
be brief, we will confine our command
language to be rather conventional (see
Table 2). It is essentially identical to
that proposed in [3].

Both commands and data are resident in
line memory. Lines are organized in terms
of flytes (short for flagged bytes);
there are N flytes to a line. The format
of a flyte is


| type | value |


In our case there are two types of
flytes: data and commands. For the data
flyte the value is the internal data
character representation. For command flytes
the value is an instruction to be
performed, the total format of which is


| type | fmt | op | opnd 1 | ... | opnd n |


Here the command may be comprised of
several flytes. The fmt, or format, field
describes the composition of the rest of
the instruction. For instance, it might
specify the number of operands. The op
field provides the command which is to be
performed. A comprehensive treatment of
the selection of instruction formats can
be found in [4,5].

Let us consider an instantiation of this
format for our instruction set.

Here we consider a flyte to be eight bits
long and character data to be in ASCII
representation. Since internal ASCII
involves only 7 bits, we can have a type
field of 1 bit and still fit a data
character in. There will be two command
formats, no operand and one operand,
distinguished by the setting of the second
bit. The no operand format will take up
the remaining 6 bits of the flyte; the one
operand format will also latch on to the
next 8-bit flyte (which must have its left
bit set) for its argument (range 0-127).

no operand:


| 1 | 0 | command |


one operand:


| 1 | 1 | command | | 1 | command |
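
The eight-bit instantiation above can be sketched as a small
encoder and decoder. This is a hedged illustration only: the
function names and the numeric command codes used in the usage
note are hypothetical, and the second flyte of a one-operand
command is taken, as the text states, to carry a 7-bit argument
behind a set left bit.

```python
TYPE_BIT = 0x80     # leftmost bit: 0 = data flyte, 1 = command flyte
FORMAT_BIT = 0x40   # second bit: 0 = no operand, 1 = one operand


def encode_data(ch):
    """A data flyte is the 7-bit ASCII code with the type bit clear."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("character is not 7-bit ASCII")
    return [code]


def encode_command(op, operand=None):
    """Encode a command in one flyte, or two if it has an operand."""
    if not 0 <= op < 64:
        raise ValueError("command code must fit in 6 bits")
    if operand is None:
        return [TYPE_BIT | op]
    if not 0 <= operand <= 127:
        raise ValueError("operand must be in the range 0-127")
    # The argument flyte also has its left bit set.
    return [TYPE_BIT | FORMAT_BIT | op, TYPE_BIT | operand]


def decode(flytes):
    """Turn a flyte sequence back into data and command items."""
    items, i = [], 0
    while i < len(flytes):
        flyte = flytes[i]
        if (flyte & TYPE_BIT) == 0:
            items.append(("data", chr(flyte)))
        elif (flyte & FORMAT_BIT) == 0:
            items.append(("cmd", flyte & 0x3F, None))
        else:
            items.append(("cmd", flyte & 0x3F, flytes[i + 1] & 0x7F))
            i += 1      # skip the argument flyte just consumed
        i += 1
    return items
```

For example, a hypothetical command 5 with argument 72 followed
by the character H and a hypothetical operand-less command 3
encodes to the four flytes C5, C8, 48, 83 (hexadecimal).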


5. THE FILL LINE UNIT: THE
MICRO-ARCHITECTURE

The unit which we have chosen to call the
Fill Line Unit (FLU) plays a role in
document preparation analogous to that
played by the arithmetic logic unit
(ALU) in scientific computation. Like
ALU's, FLU's have a decomposition theory
which allows descriptions as serial,
series-parallel, or parallel realizations
with the appropriate equipment/speed
tradeoffs. In ALU's each stage computes a
function of two operands and
a carry. In FLU's, the corresponding
situation is seen in Figure 2, which is a
simplification of the FLU.

Here I is a register which stores l
flytes, where l is chosen large enough to
generate a complete line image of
characters. The data in I at cycle n are used to
generate the output line L, and any extra
flytes are then stored in O. Thus the
functional dependencies are:

L_n = g(I_n)

O_n = h(I_n)

Further, the new value of I is determined
by the values of O and MDR:

I_n = f(MDR_n, O_{n-1})

(The analogy between an ALU and a FLU can
now be seen more clearly, since O is like a
carry and L is like a sum.) The two units
of combinational logic shown in Figure 2,
S1 and S2, are then the primary objects of
interest in the synthesis of the FLU.
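
The functional dependencies among I, L, O and MDR can be sketched
as a cycle-by-cycle simulation. The sketch below is an assumption
throughout: registers are modelled as Python strings, command
flytes and justification are ignored, f simply places the carry O
to the left of the new MDR contents, and a line is treated as
complete only when some flytes are left over.

```python
def split_at_width(i_reg, width):
    """Return (L, O): the words of I that fit, and the leftover."""
    words = i_reg.split()
    line = []
    # Always take at least one word so an over-long word cannot stall.
    while words and (not line or
                     len(" ".join(line + [words[0]])) <= width):
        line.append(words.pop(0))
    return " ".join(line), " ".join(words)


def f(mdr, o_prev):
    """S1: the carry O goes leftmost, the new input follows it."""
    return (o_prev + " " + mdr) if o_prev else mdr


def g(i_reg, width):
    """S2, line output: the part of I that forms the output line L."""
    return split_at_width(i_reg, width)[0]


def h(i_reg, width):
    """S2, carry output: the extra flytes stored in O."""
    return split_at_width(i_reg, width)[1]


def run_flu(input_lines, width):
    """Feed input lines through f, g and h; collect output lines."""
    carry, output = "", []
    for mdr in input_lines:
        i_reg = f(mdr, carry)
        line, carry = g(i_reg, width), h(i_reg, width)
        while carry:            # L is complete: emit it, reprocess O
            output.append(line)
            line, carry = g(carry, width), h(carry, width)
        carry = line            # incomplete line waits for more input
    if carry:
        output.append(carry)
    return output
```

The carry plays the role the text assigns to O: it is the only
state passed from one cycle to the next.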

The above discussion has ignored signals
which originate in S2 and set global state
information, and signals feeding the
global state information into the FLU.

The complexity of the FLU is thus seen, in
Figure 2, to depend on the complexity of
the units S1 and S2, the remaining units
being conventional registers. The role of
the S1 unit is to shift inputs from MDR to
the right by the length of the data in the
O unit, with the non-empty data in O being
transferred directly into the leftmost
stages of I. S1 performs a uniform shift
of all the elements of MDR; symbolically,

MDR_{n-1,k} --> I_{n,k+d}

where R_{x,y} is the contents of stage y of a
given register R at time x, and d is the
amount of shift required. If the size of
the MDR is s flytes, and each flyte
consists of k bits, then the complexity of S1
will be proportional to k*s*log(dmax),
where dmax is the maximum possible shift
required. In Figure 3, we show a
realization of an S1 unit for dmax = 3, k = 1,
and s = 3.

In Figure 3, the binary encoded shift
control on the left is via a register q which
requires log(dmax) bits of storage to
control the shift operation of S1, and for
the pth shift control bit, q_p, stage i is
shifted right q_p * 2^p places; i.e., no
right shift if q_p = 0, and a right shift
of 2^p if q_p = 1.
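
A logarithmic shifter of this kind can be sketched in software as
follows. This is an illustration under assumptions, not the
circuit of Figure 3: the function name s1_shift is hypothetical,
stages are list cells rather than k-bit flytes, empty stages are
shown as None, and the number of control bits is taken as the bit
length of dmax.

```python
def s1_shift(mdr, d, dmax):
    """Shift the contents of `mdr` right by d places, 0 <= d <= dmax.

    One pass is made per control bit q_p; pass p shifts by 2**p
    places when q_p is 1 and leaves the data alone when q_p is 0.
    """
    if not 0 <= d <= dmax:
        raise ValueError("shift amount out of range")
    control_bits = max(1, dmax.bit_length())
    # Room for the shifted data: dmax extra stages on the right.
    word = list(mdr) + [None] * dmax
    for p in range(control_bits):
        q_p = (d >> p) & 1
        step = q_p * (1 << p)
        if step:
            # Only empty padding stages fall off the right-hand end.
            word = [None] * step + word[:-step]
    return word
```

With dmax = 3 two control bits suffice, so any shift from 0 to 3
is produced by at most two passes over the s stages, in line with
the k*s*log(dmax) complexity estimate given above.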

The S2 is considerably more complicated
since it performs decoding of the flytes
to interpret the embedded control
information and non-uniform shifts. We can
imagine the structure of the S2 as a uniform
cascade of stages as shown in Figure 4.
Figure 4 is a conceptual decomposition of
the S2 into a linear cascaded array of
identical flyte stages. If I_i contains
data, the control unit of stage i will
pass the control signals through and
generate a shift of the appropriate amount.
If I_i contains a control flyte, and is
therefore not to be shifted to O, then the
control information passed to adjacent
units is modified, and I_i would be
deleted. It might then be necessary for
units to the right of I_i to cause a left
shift of their contents to L.

We do not give a complete description of
the S2 but describe only the logic needed
to generate a filled line, omitting the
logic needed for the other commands and
functions. Further, we shall describe the
processing as done in a single clock
cycle. Assuming a maximum line length
between 128 and 255, the following bus
lines are required (of course, using more
clock cycles allows fewer bus lines since
lines can be shared among functions):

Function                # of bits   Notation

right margin                8       RM[1-8]
right end of text           8       RE[1-8]
word count                  6       CT[1-6]
fill status                 1       F
fill shift                  5       FS[1-5]
fill parameter              2       FP[1-2]
rightmost space seek        1       S
actual shift                5       SH[1-5]

Starting at the left, the word count is
set to zero and passed to the right, being
incremented at each space following a
non-space. The right margin position, r,
is encoded on RM. At position r this
information is decoded and passed to the
left on S until the first space
immediately to the right of a non-space, at
position s, and position s is then encoded
on RE. The difference between RM and RE
is then placed on FS and the value of FS
divided by CT is placed on FP.

FS_i is the incremental amount of shift
required at stages following i to right
justify the line, and SH_i is the actual
shift of stage i. SH_i and FS_i are
computed from SH_{i-1} and FS_{i-1}, with SH_1 = 0.
If I_i is not blank, SH_i = SH_{i-1} and FS_i =
FS_{i-1}. If I_i is blank and I_{i-1} is not
blank, then if FS_{i-1} >= FP then FS_i =
FS_{i-1} - FP. (Note that FS_i + SH_i = FS_1; the
sum of the actual shift at stage i and the
added shift required is a constant.)
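
The stage-by-stage computation of the shift amounts can be
sketched as below. This is a sketch under assumptions rather than
the S2 logic itself: the function names are hypothetical, stages
are numbered from 0, the initial value of FS and the value of FP
are passed in directly, and the update of the actual shift at a
word gap is assumed to be an increase by FP, which is what the
constant-sum note implies.

```python
def stage_shifts(line, fs_total, fp):
    """Return per-stage lists (SH, FS) for the characters of `line`."""
    sh, fs = [], []
    for i, ch in enumerate(line):
        if i == 0:
            sh_i, fs_i = 0, fs_total        # leftmost stage: no shift yet
        else:
            sh_i, fs_i = sh[i - 1], fs[i - 1]
            at_word_gap = ch == " " and line[i - 1] != " "
            if at_word_gap and fs_i >= fp:
                fs_i -= fp                  # shift still owed shrinks ...
                sh_i += fp                  # ... as the actual shift grows
        sh.append(sh_i)
        fs.append(fs_i)
    return sh, fs


def apply_shifts(line, sh):
    """Move each character right by its stage's actual shift."""
    out = [" "] * (len(line) + (sh[-1] if sh else 0))
    for i, ch in enumerate(line):
        out[i + sh[i]] = ch
    return "".join(out)
```

Because the shift never decreases from left to right, no two
characters land on the same output position, and the blanks
opened up fall exactly at the word gaps.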

Current component densities are adequate
to contain a fully parallel FLU on a
single chip for a maximum line size of 130
characters [5].


6. EXTENSIONS TO OTHER TP FUNCTIONS:
ENHANCING THE MICRO-ARCHITECTURE


The FLU described in the previous section
has shown how to implement many of the
classical functions as described in [3].
Other text processors may choose to add
functions to these to produce a Cadillac
version text processor. While, for reasons
of style, we prefer the simpler text
processors - especially in an expository
treatment - it is worth considering
briefly how the architecture described is
adaptable to some of these deluxe
features. The two which we shall describe
are text macros and hyphenation.

6.1 TEXT MACROS

A text macro is a sequence of flytes which
replaces a single flyte in the source text
prior to execution. We shall assume, for
simplicity, that the replacement text is
fully expanded although, in principle, it
need not be. Let m be the macro variable
flyte invoking the macro and assume that m
occurs in a source line x m y. Assume
further that M is the expansion of m. Then
after macro substitution the source text
is x M y, where M is a sequence of flytes.
The transformation from x m y to x M y
does not affect x and involves shifting y
to the right by length(M) - length(m).
Then the substitution text, M, must be
placed in the resultant gap. Now the FLU
architecture is designed to facilitate
exactly this kind of data movement.

Within the context of the FLU, the macro
definitions could be stored in an
associative ROM. Upon invocation of the macro the
replacement text would be retrieved from
the ROM and shifted to the appropriate
position (using the S1 unit).

Conditional expansion of macros based upon
macro variable flytes and external
variables (e.g., register contents,
transformations on register contents) is also
possible. A condition PLA having inputs of
macro variable flytes and external
variables can generate an output c depending
upon which conditions are met. The
associative ROM holding the macro definitions
would be accessed by the key (m,c) where m
is the macro variable flyte. The input
(m,c) would act as a composite key for the
macro definition.


6.2 HYPHENATION

Most hyphenation schemes depend on some
simplified  algorithm  to  approximate 
correct  hyphenation.  We  shall  assume  that 
we  have  available  a  small  hyphenation  box, 
H,  whose  function  is  as  follows:  Given  a 
sequence  of  n  letters  representing  the 
tail  of  a  word  (possibly  the  whole  word), 
and  a  parameter  q,  H  will  determine  the 
place  closest  to  and  less  than  q  where  a 
hyphen  can  be  placed.  While  we  have  not 
studied  hyphenation  algorithms  in  detail, 
we  do  not  think  that  the  design  of  such  a 
unit  is  extremely  difficult. 

Now the FLU will gate the word to be
hyphenated  to  H  with  parameter  q  indicat¬ 
ing  where  the  hyphenation  is  needed  and 
will  use  the  returned  signals  to  control 
shifting  and  line  filling. 
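
Both extensions of this section can be sketched in a few lines.
The sketch is illustrative only and rests on assumptions: the
associative ROM is modelled as a dictionary keyed by (m, c), the
names expand_macro and hyphen_point are hypothetical, and the
hyphenation box H is reduced to its stated interface, with the
permissible break positions supplied by the caller rather than
derived by any hyphenation algorithm.

```python
def expand_macro(source, macros, condition=None):
    """Replace each macro variable flyte m by its expansion M.

    `source` is a list of flytes; `macros` maps the composite key
    (m, c) to a list of replacement flytes.
    """
    out = []
    for flyte in source:
        replacement = macros.get((flyte, condition))
        if replacement is None:
            out.append(flyte)          # x and y pass through unchanged
        else:
            out.extend(replacement)    # M fills the gap opened before y
    return out


def hyphen_point(breaks, q):
    """Return the break position closest to and less than q, if any."""
    candidates = [b for b in breaks if b < q]
    return max(candidates) if candidates else None
```

In the first function everything after m ends up length(M) -
length(m) positions further right, which is the data movement the
text attributes to the S1 unit.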


7. EXTENSION TO THE HOST ARCHITECTURE:
ENHANCING THE MACRO-ARCHITECTURE

In  Section  3  we  provided  a  strictly 
vanilla  architecture  as  a  vehicle  for 
presenting  the  FLU.  We  believe  that  such 
an  architecture  can  be  generalized  to  one 
of  a  document  preparation  machine.  Per¬ 
tinent  ideas  to  this  end  will  now  be 
presented,  but  in  the  context  of  a  text 
processing  environment. 

The architecture of a computer system
should be responsive to the needs of the
user. In a text-formatting environment,
there is a need for entering information
from interactive terminals and outputting
formatted information from printers or
terminals.

Users input requests and the system
translates them into actions that it can
execute. These actions can be realized by
functional units, micro-coded subroutines,
etc. For instance, the request
format(file descriptor) might be
translated by the system into actions
which include: transform(line),
get(buffer), output(line). Here
transform(line) would get the next line
from a main memory buffer and format it
for printing, output(line) would give the
line to a suitable output device, and
get(buffer) would replenish the line
buffer.

For a given request, its associated
actions are related by rules for their
application. These rules can be
represented by a state diagram where the
states represent the actions and the
transitions represent their outcomes (see
Figure 5).

In the high-level architecture for the
text-formatting machine there exists a
supervisory unit which contains the state
diagrams for all requests and which
sequences through these actions as the
requests progress.

Figure  6(a)  gives  a  conceptual  view  of  the 
Supervisor,  Figure  6(b)  gives  a  suitable 
refinement,  and  Figure  6(c)  gives  the  exe¬ 
cution  cycle  for  the  refinement.  Note 
that  terminals  put  requests  on  a  queue 
which  is  eventually  processed  by  the 
Supervisor.  The  outcomes  of  each  action 
execution are fed back to the Supervisor
for  further  processing  according  to  the 
state  diagram  associated  with  the  execut¬ 
ing  request. 

This supervisory model forms the basis for
a multiuser interactive system. (More
about this approach can be found in
[1,2].) Here the actions associated with
the executing request of one user can be
overlapped with the actions associated
with the executing requests of the other
users. Thus we have a pipeline
organization where we are always executing
different parts of separate requests in
parallel.

Expanding on this structure yields a
machine architecture as pictured in Figure
7. Users enter requests to create, edit
and process text to be formatted. Output
can appear on either the initiating
terminal or on a line printer.

The Supervisor controls the sequence of
action executions while the functional
units realize the actions in terms of
micro-orders, register transfers, etc. As
an example, let us specify in micro-orders
the semantics of the transform action:

transform(line) =

    Bank_select <- get_queue();

    if ((LCR) = 0)
        outcome(EXHAUSTED);
    else {
        MDR <- MDR o LM[LAR];
        MDR <- TRAN(MDR);
        LAR <- LAR + 1;
        LCR <- LCR - 1;
        outcome(OKAY);
    }

Here outcome(code) places a return code on
the Request Queue, get_queue() gets the
next entry from the transform action
queue, Bank_select is a register which
indexes the appropriate user's register
set, the operation 'o' concatenates the
contents of the next line in line memory
to the MDR, and TRAN(MDR) provides the
combinational logic function to do the
line filling and manipulating operations.
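
The transform action can be mimicked in software to make the
register transfers concrete. The sketch below is an assumption in
every detail beyond the micro-orders themselves: the RegisterBank
class, the use of Python deques for the two queues, a single
shared line memory list, and the caller-supplied tran function
standing in for the FLU logic are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class RegisterBank:
    """The per-user registers used by the transform action."""
    lar: int = 0      # line address register
    lcr: int = 0      # line count register
    mdr: str = ""     # memory data register


def transform(banks, line_memory, action_queue, request_queue, tran):
    """Carry out one transform action for the next queued user."""
    user = action_queue.popleft()              # Bank_select <- get_queue()
    regs = banks[user]
    if regs.lcr == 0:
        request_queue.append((user, "EXHAUSTED"))
        return
    regs.mdr = regs.mdr + line_memory[regs.lar]    # MDR <- MDR o LM[LAR]
    regs.mdr = tran(regs.mdr)                      # MDR <- TRAN(MDR)
    regs.lar += 1
    regs.lcr -= 1
    request_queue.append((user, "OKAY"))
```

Since each call touches only the selected user's registers, calls
on behalf of different users can be interleaved freely, which is
the property the pipeline organization relies on.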


8. GENERALIZATIONS: OTHER APPLICATIONS OF
THE ARCHITECTURE

There are several essential features in
the design of the FLU which suggest
generalizations to functions other than
document preparation. First, the FLU operates
on a unit of data which is much larger
than the elemental storage component
typically processed at the instruction level
of the computer. This can be considered
the outer loop of the FLU control. Second,
within the unit of data being processed by
the FLU there is a functional pattern
suggesting iterative decompositions, which
can be parallel or series-parallel and
which are amenable to replication at the
component level. Third, within the data
unit processed by the FLU there is a
combination of data and control elements
similar to a tagged architecture.

The general action of the FLU may thus be
understood at the outer level of control
as:


while (FOREVER) {
    if (DATA UNIT NOT COMPLETE)
        FETCH MORE INPUT;
    else
        PROCESS THE DATA UNIT;
}


which in the specific document preparation
component case becomes:

while (FOREVER) {
    if (OUTPUT LINE NOT COMPLETE)
        FETCH ANOTHER INPUT LINE;
    else
        GENERATE AN OUTPUT LINE;
}


In either case the input is a sequence of
flytes in which the data and control are
intermixed, and the output is a sequence
of data flytes. The relationship between
the size of the input quantum, the size of
the output quantum, and the intermediate
storage within the FLU must be studied to
obtain optimal performance.


The same processing loop is applicable to
a variety of programs in UNIX which have
essentially this overall control structure,
such as awk [10], sed [9], and grep [9].
(Awk and sed analyze text line by line,
while grep searches lines to detect a
pattern.) The FLU can be adapted to a variety
of programs by having the cascade control
logic of the S2 unit under microprogram
control.


9.  CONCLUSION 

Our architectures provide for a synthesis
of very large scale integrated circuit
technologies and program structure
concepts to respond to the needs of office
automation.

The macro-architecture and the
micro-architecture which we have described
combine to provide a state-of-the-art unit
suited to an increasing number of
applications.


ABSTRACT

This paper discusses the design of a
special purpose computer to be used in the
scanning of text. The design of this
machine allows it to operate at a
reasonably high level when performing text
searches. This capability not only
simplifies the requirements of the
translation process used to derive machine code
from user enquiries but also enhances the
speed of the device, which is an essential
feature if data is to be scanned while
being taken from a rotating storage
medium. Of special interest is the design of
the term-detection unit, which incorporates
features which should be of use in a
direct-execution architecture,
specifically those modules which are responsible
for the recognition of keywords and tokens
in a stream of source text.


INTRODUCTION 

In the past few years we have seen a
growing involvement with systems which
have as their main function the scanning
of extremely large data bases of textual
information containing perhaps billions of
characters. Examples of such applications
include text retrieval systems for
intelligence reports, treatises and corpora in
law libraries, medical bibliographic
services, and large repositories of newspaper
articles.

This literature searching is mainly
characterized by the fact that the textual
information is not structured. Due to the
way the information is collected and
because of the nature of the information
it is usually difficult to provide
adequate cost-effective indexing systems.
Consequently, if there is any subdivision
of the information content, it will be
such that the information is grouped into
categories which are very extensive in
scope. In such a situation, the literature
search is accomplished by scanning
the entire text. Information is extracted
when it satisfies the requirements of a
user query which should specify a
sufficient number of constraints on the search
to produce the required documents and
little else.

The internal formatting of the text
may be rather inconvenient and limited to
standard punctuation, although special
characters may be used to delimit and
hence define various text groupings such
as sentences, paragraphs, sections,
documents etc.

Various papers [1,2,3,4,5,6] have
discussed a variety of architectures for
text retrieval, and in [7] Hollaar
discusses the problems associated with such
endeavours and presents a survey of some
of the architectures which are of current
interest. In [6] Chu suggests that
research should explore the hardware/
software trade-offs for particular
applications involving high-level constructs.
This paper is essentially an attempt to
bring some of the high efficiency and high
performance aspects of direct-execution
architecture to the special purpose
application of text scanning.


SYSTEM  FUNCTIONS 

In text retrieval systems, a three
step process is involved in the capture of
textual information:

1) query translation

2) term detection

3) query resolution

The user terminal (see fig. 1) passes
to the system an information request which
is expressed as a query. Examples of such
queries are as follows:

A    Keyword Search

Retrieve any document that contains
the character string A.

(A,B,C,D)#n    Threshold 'OR'

Retrieve any document that contains
at least n of the different character
strings A,B,C,D. Note that if n=1
this is an "OR" operation; hence the
retrieved document contains one or
more of the strings A,B,C, or D. If n
equals the number of entries in the
list, then this is an "AND" operation;
the retrieved document must
contain all the indicated strings in
any order.

A AND NOT B    Logical Expressions

Retrieve any document that contains
the character string A but not the
character string B.

<A,B>#n    Directed Proximity

Retrieve any document that contains
the character string A followed by
the character string B within n
characters.

[A,B]#n    Undirected Proximity

Retrieve any document that contains
character strings A and B within n
characters of each other.

A!!!B "Don't Care" Characters

Retrieves the document with the character string A followed by
three arbitrary characters followed by the character string B.
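The query forms listed above can be read as predicates over a single document. The following Python sketch is only a software illustration of their meaning, not the hardware described in this paper. All function names are hypothetical, and the proximity rule (the gap between the end of A and the start of B is at most n characters) is an assumption, since the text does not say how the distance is measured.

```python
import re

def positions(doc, s):
    """Start offsets of every (possibly overlapping) occurrence of s in doc."""
    return [i for i in range(len(doc) - len(s) + 1) if doc.startswith(s, i)]

def keyword(doc, a):
    return a in doc

def threshold_or(doc, terms, n):
    # True when at least n of the different strings occur in the document.
    return sum(1 for t in set(terms) if t in doc) >= n

def and_not(doc, a, b):
    return a in doc and b not in doc

def directed_proximity(doc, a, b, n):
    # Assumption: B starts between 0 and n characters after the end of A.
    return any(0 <= j - (i + len(a)) <= n
               for i in positions(doc, a) for j in positions(doc, b))

def undirected_proximity(doc, a, b, n):
    return directed_proximity(doc, a, b, n) or directed_proximity(doc, b, a, n)

def dont_care(doc, a, b, k=3):
    # A, then exactly k arbitrary characters, then B.
    pattern = re.escape(a) + "." * k + re.escape(b)
    return re.search(pattern, doc, re.DOTALL) is not None
```

With n equal to 1, `threshold_or` behaves as an OR over the list, and with n equal to the number of listed strings it behaves as an AND, which matches the description of the threshold query.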

In the next step, the query translator will create the necessary
machine code and will send it (along with the required data items)
to the query resolution module which guides the behavior of the
control unit in the term detector and gathers responses from the
term detector in order to resolve queries.

Since it is necessary to scan a vast amount of text, a high speed
of execution in the term detector and query resolution modules is
of utmost importance. In this design, the scan operations are
designed to function at a reasonably high level. During most of
the time a search operation will be carried out as the execution
of one instruction in the search control unit. If the input text
currently being examined contains characters that produce a
successful match with a given term, then the execution of various
instructions may be effected in order to accomplish some aspect of
the query resolution, but in most circumstances the microcode
executed during a scan instruction will rapidly skip over text
characters which do not match with any of the given terms. As we
shall see, it is possible to design hardware facilities which will
accomplish some of the query resolution without resorting to the
execution of code in the Query Resolution Processor (QRP).

The modular structure of the nucleus of the text scanning system
is presented in fig. 2. Because of its functional capabilities, it
includes the term detection unit of fig. 1 and, in addition to
this, it also involves some aspects of the query resolution block.


The term detection module receives text from a suitable source and
attempts to match character substrings in this text with the
character string terms stored in the string memory contained
within the module.

When a successful match is detected, the match line is given an
active signal and the memory address of the matching string is
passed down to the status FIFO so that, if necessary, the match
can be "logged" for future use by the QRP. The address is also
passed to the Interrupt Generation Unit (IGU) which can be used to
implement the "threshold-or" function mentioned earlier. The IGU
also decides whether the address is to be logged in the status
FIFO.

The delimiter detection unit issues interrupts whenever a
delimiter passes in the text stream. It is mainly used to detect
the beginning of successive documents in the source text since
many of the queries will be related to the contents of a document.

Thus, an interrupt can be initiated for any one of the following
events:

a)  Detection  of  a  delimiter 

b)  Detection  of  a  term 

c)  Completion  of  a  threshold-or 
during  passage  of  a  document. 

In all cases, an interrupt line causes the QRP to acknowledge an
event which is important to the resolution of a query. If it
cannot immediately deal with such an event, all pertinent
information is temporarily logged as status in the FIFO buffer
until the QRP can find the time to accept it.


TERM  DETECTION 

The input to the term detector is taken from a source, for
example, a disk drive, which can issue a serial stream of
characters. It is anticipated that the amount of processing time
required between character shifts will be less than 400
nanoseconds. Since typical transfer rates for a disk are about one
byte per microsecond this system should be able to accept data
directly from a disk without the need for buffer memories or
FIFO's.

The heart of the term detector consists of a lengthy shift
register which shifts in source text one byte (a single character)
each time a shift operation is issued by search control. The shift
register is capable of holding 32 characters which are available
from the "parallel-out" lines of the shift register. These 32
characters can be compared with any one of 256 strings (or terms)
in a "string memory" which has a data bus capable of dealing with
32 characters in parallel. Comparisons are accomplished by a
linear array of comparators placed between the string memory and
the shift register. It is anticipated that each character position
in the string memory will involve a 7 bit ASCII code and an
additional bit used to signify a "don't care" or unconditional
match character.

The string memory is a standard static RAM since the use of
associative memory for this function would be very costly at the
present time. However, it is obvious that some type of parallel
search must be made and consequently, the parallel outputs from
the middle four character positions of the shift register lead to
an associative memory which also has a word depth of 256. We will
refer to these four characters as the "partial match" characters.
Prior to the scan operation, the system will ensure that the term
in word n of the string memory corresponds to four partial match
characters of word n in the associative memory (CAM) (see fig. 3).
As text streams through the shift register, a comparison can be
effected between the partial match outputs and all the words in
the associative memory. If a match is detected, the address of the
matching word is derived from an encoder which is driven by the
match outputs of the associative memory. This address is fed to
the string memory so that another comparison can be accomplished,
this time involving the full string. This final full comparison
will determine whether the contents of the shift register contain
one of the terms required by the user query. With a suitably fast
RAM for the string memory, both comparisons can be easily
accomplished in the time interval between successive shifts as
characters stream off disk.
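The two-stage comparison described above can be mimicked in software. The Python sketch below is an illustration under stated assumptions rather than a model of the actual circuitry: the "middle four" positions are taken to be 0-based positions 14 to 17 of the 32-character register, the character "!" stands in for the separate don't-care bit, and the names `build_memories`, `full_match` and `scan` are hypothetical.

```python
WIDTH = 32             # characters held by the shift register
MID = slice(14, 18)    # assumed "middle four" positions (0-based 14..17)
WILD = "!"             # stands in for the don't-care bit in this sketch

def build_memories(words):
    """words: WIDTH-character strings, with WILD marking don't-care positions."""
    cam = {}
    for addr, word in enumerate(words):
        assert len(word) == WIDTH
        key = word[MID]
        assert WILD not in key and key not in cam, "partial-match words must be unique"
        cam[key] = addr
    return words, cam

def full_match(word, window):
    # Every position either is a don't care or equals the register character.
    return all(w == WILD or w == c for w, c in zip(word, window))

def scan(text, words):
    """Yield (index of the last character shifted in, word address) per full match."""
    string_memory, cam = build_memories(words)
    window = "\0" * WIDTH                  # register contents before any text arrives
    for pos, ch in enumerate(text):
        window = window[1:] + ch           # one shift per incoming character
        addr = cam.get(window[MID])        # stage 1: four-character associative lookup
        if addr is not None and full_match(string_memory[addr], window):
            yield pos, addr                # stage 2: full 32-character comparison
```

Most shifts fail the cheap four-character lookup, so the full comparison runs only for the rare candidate positions, which is the point of pairing a small associative memory with an ordinary RAM.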

Our only constraint is that all words in the associative memory be
unique. Since most terms in the string memory are not going to be
a full 32 characters in length, we should be free to locate a term
within its word so that it assumes a position such that the four
characters in the partial match positions are different from all
the rest.

For example, suppose we are searching the data base for the
following ten terms:

" GUYS AND DOLLS "

" THE NIGHT OF THE IGUANA "

" A STREETCAR NAMED DESIRE "

" WHAT MAKES SAMMY RUN? "

" THE DIARY OF ANNE FRANK "

" A LITTLE NIGHT MUSIC "

" SWEET CHARITY "

" THE UNSINKABLE MOLLY BROWN "

" A CHORUS LINE "

" DON'T BOTHER ME, I CAN'T COPE "


Successive words in the string memory might be set up as:

!!!!!!!!!!!!! GUYS AND DOLLS !!!
!!!!!!! THE NIGHT OF THE IGUANA 
!!!!!! A STREETCAR NAMED DESIRE 
!!!!!!!!! WHAT MAKES SAMMY RUN? 
!!!!!!! THE DIARY OF ANNE FRANK 
!!!!!!!!!! A LITTLE NIGHT MUSIC 
!!!!!!!!!!!!! SWEET CHARITY !!!!
!!!! THE UNSINKABLE MOLLY BROWN 
!!!!!!!!!!!!! A CHORUS LINE !!!!
! DON'T BOTHER ME, I CAN'T COPE 

while the successive words in the associative memory would be:


The ! in the above list represents a don't care or unconditional
match character.

As can be seen in the above example each entry in the partial
match columns within the associative memory is selected from the
corresponding character positions of terms in the string memory.
After a parallel comparison of source with all words in the
associative memory a successful match will simply indicate a
matching substring and the address of the parent term containing
that substring. One more comparison with the parent term in the
string RAM will serve to verify whether the complete term is in
the source text.
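The freedom to slide a term within its 32-character word can be illustrated with a simple greedy placement. The Python sketch below is not the procedure used by the authors, which the text does not describe. It assumes the partial match positions are 0-based positions 14 to 17, uses "!" for don't-care padding, and tries left-to-right offsets until the four middle characters are unique. The function name `place_terms` is hypothetical.

```python
WIDTH, LO, HI, WILD = 32, 14, 18, "!"

def place_terms(terms):
    """Pad each term with WILD so that its four middle characters are unique."""
    words, used = [], set()
    for term in terms:
        if not 4 <= len(term) <= WIDTH:
            raise ValueError(f"term length must be 4..{WIDTH}: {term!r}")
        for offset in range(WIDTH - len(term) + 1):
            word = WILD * offset + term + WILD * (WIDTH - offset - len(term))
            key = word[LO:HI]
            # The key must lie wholly inside the term and must not be taken yet.
            if WILD not in key and key not in used:
                used.add(key)
                words.append(word)
                break
        else:
            raise ValueError(f"no free partial-match position for {term!r}")
    return words
```

A greedy choice can fail for term sets that share many four-character substrings; a real loader might then backtrack or reorder the terms, which this sketch does not attempt.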

It should be noted that in the interest of clarity we have omitted
from fig. 3 the additional circuitry required to perform a write
operation into the associative memory. Prior to the search, the
control unit will define both string memory and associative memory
by shifting each query term into an appropriate position within
the shift register whereupon a write operation may be executed.


THE  INTERRUPT  GENERATION  UNIT 

When the term detector places an active signal on the match line,
it is an indication to the rest of the system that the value
currently on the address bus is the address of a location in
string memory containing a required term. At this time, such an
address is accepted by the Interrupt Generation Unit (IGU) and
used to aid the processing of a query resolution.

The activity to be initiated by this detection is defined by the
contents of a RAM which is established prior to the search. The
address value from the term detector is used to access a 6 bit
word which is used to control the following two activities:

a)  Interrupt  enable 

If this bit is set, an interrupt signal is issued to the query
resolution processor (QRP). The QRP can then act on the presence
of the indicated term by executing code associated with the
resolution of some particular query.

b) Hardware execution of the "threshold-or"

Another bit in the IGU RAM, the "threshold-or enable", is used to
determine whether or not the detection of this term is to be
accompanied by the decrementing of a counter which is responsible
for the maintenance of the term count associated with a particular
threshold-or. The remaining four bits of the word select (via a
decoder) one of sixteen counters. Each counter is programmable and
can be loaded from the data bus coming from the QRP. Each counter
is four bits long, and hence the maximum threshold allowed in such
a query is 16.

Since a particular term in any document must decrement the
selected counter once and only once, a separate RAM maintains a
"hit-list". At the start of a document, all entries in this RAM
are set to zero. When a term is first detected (match line high)
the presence of a zero from the hit-list and a one from the
threshold-or enable bit will cause the selected counter to
decrement. This cycle is immediately followed by a cycle which
writes a 1 bit into the hit-list and hence any future detection of
the term within the same document will not produce an active level
on the decode enable line.
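The interplay of the control word, the counters and the hit-list can be sketched in software as follows. This Python class is only an illustration. The class and method names are hypothetical, the control word is modelled as a tuple rather than a packed 6 bit value, and it is assumed that the threshold-or is complete when the selected counter reaches zero and that the caller reloads the counters for each document.

```python
class InterruptGenerationUnit:
    """Software sketch of the threshold-or logic; all names are illustrative."""

    def __init__(self, n_terms=256, n_counters=16):
        # Per-term control word: (interrupt_enable, threshold_enable, counter_select)
        self.control = [(False, False, 0)] * n_terms
        self.counters = [0] * n_counters
        self.hit_list = [False] * n_terms

    def load_counter(self, index, threshold):
        self.counters[index] = threshold

    def start_document(self):
        # The hit-list is cleared at the start of every document.
        self.hit_list = [False] * len(self.hit_list)

    def on_match(self, addr):
        """Return (interrupt, threshold_reached) for a match on word addr."""
        interrupt, thr_enable, select = self.control[addr]
        reached = False
        if thr_enable and not self.hit_list[addr]:
            self.hit_list[addr] = True          # a term counts once per document
            if self.counters[select] > 0:
                self.counters[select] -= 1
                reached = self.counters[select] == 0
        return interrupt, reached
```

For a query such as (A,B,C)#2, the words holding A, B and C would share one counter loaded with 2, so the second distinct term seen in a document reports completion while repeats of the first term are ignored.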

In actual practice, it may be necessary to duplicate the hit-list
facility since it must be cleared between documents. Consequently,
it may be necessary to clear one list while the other is being
used.

Finally, it should be noted that a pipeline effect can be
incorporated into the design. Once the match address is available,
it can be latched for use by the IGU and in this way the activity
of the IGU and the processing of the next character in the term
detection unit may be overlapped.


CONCLUSION 

We have presented a design for a text scanner which uses a term
detection unit incorporating random access memory and associative
memory in a cost effective manner. An additional module, referred
to as the interrupt generation unit, contributes information which
greatly enhances the system implementation of high level queries
such as the threshold-or.


Abstract

A COBOL Machine applicable to an attached processor has been
developed. It is characterised by having intensive COBOL machine
architecture (COMBAT), highly-specialised hardware structure and
compact and efficient host processor interface.

COMBAT architecture has many facilities for efficient COBOL
program execution: many internal data, highly functional data
descriptors and intensive instructions. COMBAT machine is
functionally composed of three processor modules (IFPM, OFPM and
EXPM), highly specialised for their functions.

It is found that average COBOL statement execution time is 35% of
host processor execution time. A COMBAT machine attains better
cost/performance and is useful for a special COBOL processor
attached to a medium or large-scale computer.

Introduction 

Recent advances in solid-state technology and software crisis due
to increases in computer applications are accelerating the
research and development of high-level language machines. From the
viewpoint of their utilization style, high-level language machines
are classified into two categories: a stand-alone processor and an
attached processor or an element processor of a
distributed-function computer system¹. Burroughs B1700² and NCR
COBOL Virtual Machine³ are typical examples of a stand-alone
high-level language machine. PASCAL Microengine⁴ from Western
Digital Corp. is also a recent interesting product, applied to
microcomputer applications.

On the other hand, current marked decreases in the cost of
hardware and advent of highly functional processor modules make it
not only technically feasible, but economically practical to
develop the attached high-level language machine. Taking this
trend into consideration, a COBOL machine applicable to an
attached processor has been implemented.

In order to attain better cost performance in a high-level
language machine, machine architecture and hardware structure
design, based on actual user environment are important. For this
purpose, an analysis tool is implemented. The analysis tool
gathers COBOL user's program profile, including COBOL verbs,
operand data attributes and so on. With the help of this tool, a
COBOL machine architecture, highly optimized for COBOL program
processing, and a COBOL machine hardware structure, greatly
specialized for its machine architecture, are obtained.

The COBOL machine can effectively execute major COBOL processing.
However, input-output operations, communication control, data base
management, software-level virtual memory management and so on,
are required for a host processor. Therefore, in a high-level
language machine for an attached processor, highly effective,
compact and flexible process switching mechanism between an
attached processor and a host processor is required. In order to
accomplish this function, effective connection interface at the
internal bus and firmware level is provided.

COBOL user's programs are translated into highly functional COBOL
machine instructions by a software translator, which runs on a
host processor.

As an evaluation criterion of high-level language machine
architecture, IPF (Instructions Per Function), which indicates how
many machine instructions correspond to a source statement, is
selected. IPF means machine architecture language proximity. In
order to evaluate IPF value and object memory capacity per COBOL
statement, an evaluation tool is implemented.
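As a rough illustration of the IPF criterion, the Python sketch below computes the mean number of machine instructions per source statement. Treating IPF as a simple average over the statements of a program is an assumption made for this sketch; the text does not give the exact formula used by the evaluation tool, and the function name is hypothetical.

```python
def instructions_per_function(instr_counts):
    """instr_counts[i] is the number of machine instructions for statement i."""
    if not instr_counts:
        raise ValueError("need at least one source statement")
    return sum(instr_counts) / len(instr_counts)
```

Under this reading, an architecture in which every statement maps to exactly one instruction has an IPF of 1.0, and larger values indicate a wider gap between the source language and the machine.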

At present, the COBOL machine is running as a processor attached
to a host processor, in which a medium-scale conventional
commercial computer (NEAC ACOS series 77 Model 300) is used as a
base computer. In the host processor, therefore, FORTRAN, PL/I and
COBOL program execution are possible, as well as COBOL program
compilation.

As a result of this attachment, COBOL program execution in the
host processor is excluded for the COBOL machine. This results in
host processor performance enhancement for through-put and
turn-around-time.

In the following sections, a COBOL machine architecture, COMBAT
(COBOL Oriented Machine Basic ArchiTecture), a machine hardware
structure, host processor interface and evaluation results are
described.

System  Overview 

Figure 1 shows COMBAT system configuration, including analysis and
evaluation tools. The COMBAT system is composed of COMBAT
translator and COMBAT machine connected to a host processor.

COBOL programs are translated into highly functional COMBAT
instructions by a software COMBAT translator, whose language
specification is compatible with a host processor (ANSI 74 COBOL⁵)
for practical use and impartial evaluation of the system. The
higher the functional level of a high-level language machine
architecture becomes, the simpler a translator becomes. A
translator is composed of high-level language dependent part and
target machine dependent part. In the COMBAT translator, the
processing time and memory capacity for the latter part greatly
reduce due to its high functionality.

Fig. 1 COMBAT System Configuration and Analysis/evaluation Tools

Highly functional machine architecture, including host processor
interface and intensive hardware structure, closely related to
machine architecture, are required for an attached high-level
language machine. The COMBAT machine can effectively perform major
COBOL functions for data manipulation, table handling, arithmetic
operations and conditional operations. Moreover, high performance
and compact host processor interface to enable execution of other
features are provided to the host processor, e.g. input-output
operations, communication control and virtual memory management at
the software level. A host processor call instruction in COMBAT
machine realises this function.

COMBAT machine architecture is greatly optimised for COBOL
language processing in order to obtain high performance at the
machine architecture level. Most COBOL statements, therefore, can
be translated into a single COMBAT instruction. Various formats of
internal data directly corresponding to all user defined source
data are provided.

COMBAT machine has a hardware structure specialised for COMBAT
architecture, which is mainly composed of three functionally
distributed processor modules (IFPM: Instruction Fetch Processor
Module, OFPM: Operand Fetch Processor Module and EXPM: Instruction
Execute Processor Module). These processor modules are also
specialised for their functions using microprogramming techniques
and powerful hardware components.

Architecture and Hardware Organisation


Architecture 

A COBOL Oriented Machine Basic Architecture (COMBAT architecture)
has been specified to obtain better trade-offs between hardware
and software in high-level language processing. In high-level
language machines, it is most significant to decide how much a gap
is reduced between a source statement and a machine instruction.
In order to attain better performance, the machine instruction set
is defined to correspond to a COBOL source statement as closely as
possible. Therefore, the following functions are performed during
a machine instruction execution.

(i) Data type conversion or adjustment.
(ii) Indexing by index data or subscript data.
(iii) Editing required for data transfer and arithmetic
operations.

Machine Instruction Format. Most COBOL source statements are
translated into a machine instruction by a software translator,
which corresponds to a conventional compiler. A machine
instruction is composed of operation code and operand syllables,
as shown in Fig. 2. If necessary, a variant syllable or operand
number syllable is appended to the operation code. Each operand
syllable represents a data item. When the operand is an element in
an array, several operand syllables are necessary to specify index
or subscript data items.


MOVE A TO B(I,J)

[Figure: the source statement above mapped onto a machine
instruction made up of an operation code, an operand number
syllable and operand syllables.]

Fig. 2 Source Statement and Machine Instruction Correspondence


Data and Descriptor. COBOL users can handle various data formats
in a COBOL program. Since there are only a few data formats
directly manipulated in a conventional machine, an object program
should convert them into internal formats at run time. This COMBAT
machine provides all data formats required in the ANSI 74 COBOL
specification. Table 1 lists arithmetic data formats as an
example.

Descriptor architecture is adopted to facilitate more complex data
description capability for decimal scaling and editing operations.
Simple data format operand, however, can be specified without a
descriptor to avoid performance disadvantage due to using the data
descriptor.

COBOL language allows the user to describe very complex operation
in a statement. If it is translated into a single machine
instruction, the hardware design becomes too complicated. In the
COMBAT machine, complex statements are divided into several basic
operations. For example, EXAMINE or INSPECT statement functions
are performed with the combination of TALLY and REPLACE
instructions. SEARCH and PERFORM statement functions are also
translated into several instructions.


Hardware Configuration

The function performed within the COMBAT machine is higher than
that for conventional machines. The microprogramming and pipelined
architecture is suitable to effectively realize high
functionality. In the COMBAT machine, a machine instruction
execution is divided into three phases, instruction fetch, operand
fetch and execution. Each phase is executed by three independent
processor modules, as shown in Fig. 3, Instruction Fetch Processor
Module (IFPM), Operand Fetch Processor Module (OFPM), and
Instruction Execute Processor Module (EXPM), respectively. These
processor modules are connected with each other through
First-In-First-Out (FIFO) queue memories. This configuration is
intended to be implemented with VLSI chips.
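The three-phase organisation (instruction fetch, operand fetch, execution) with FIFO queues between the modules can be illustrated by a small cycle-by-cycle simulation. The Python sketch below is a teaching aid, not a model of the COMBAT hardware. The two operations "ADD" and "MOVE", the memory dictionary and the function name `run_pipeline` are all hypothetical, and each stage is assumed to take exactly one cycle per instruction.

```python
from collections import deque

def run_pipeline(program, memory):
    """program: list of (op, operand_names). Returns (results, cycles used)."""
    if_to_of, of_to_ex = deque(), deque()   # FIFO queues between the modules
    results, pc, cycles = [], 0, 0
    while pc < len(program) or if_to_of or of_to_ex:
        cycles += 1
        # Stages are visited last-to-first so that an instruction advances
        # by exactly one stage per cycle.
        if of_to_ex:                                    # execute stage
            op, values = of_to_ex.popleft()
            results.append(sum(values) if op == "ADD" else values[0])
        if if_to_of:                                    # operand fetch stage
            op, names = if_to_of.popleft()
            of_to_ex.append((op, [memory[n] for n in names]))
        if pc < len(program):                           # instruction fetch stage
            if_to_of.append(program[pc])
            pc += 1
    return results, cycles
```

With these assumptions, N instructions complete in N + 2 cycles instead of 3N, because the three modules work on different instructions at the same time.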


Table 1 Arithmetic Data Formats in the COMBAT Machine

Data Format                                 COBOL Usage
------------------------------------------  ------------------------------------
Signed Binary Short                         COMP-1
Signed Binary Long                          COMP-2
Signed Packed Decimal                       COMP-3
Signed Unpacked Decimal                     COMP/DISPLAY (SIGN IS TRAILING)
Unsigned Unpacked Decimal                   DISPLAY (NO SIGN)
Leading Signed Unpacked Decimal             DISPLAY (SIGN IS LEADING)
Separate Trailing Signed Unpacked Decimal   DISPLAY (SIGN IS TRAILING SEPARATE)
Separate Leading Signed Unpacked Decimal    DISPLAY (SIGN IS LEADING SEPARATE)


Fig. 3 COMBAT Machine System Configuration

Instruction Fetch Processor Module (IFPM). The main IFPM role is
to generate internal forms corresponding to an instruction for
easy following manipulations. The operation code and variant
syllable are packed into a 32-bit internal machine instruction, as
shown in Fig. 4, and transferred to OFPM and EXPM through machine
instruction FIFOs. The operand syllables are also packed into a
72-bit internal data descriptor for each operand. Within this
process, indexing and subscripting are resolved and an effective
operand address is located in the internal data descriptor.
Another important role is to control the COBOL program execution
sequence. Normally, IFPM continues prefetching according to the
sequence represented by such as GOTO, IF and PERFORM statements.

Internal Machine Instruction (bits 0-31): OP, VAR, N

Internal Data Descriptor (bits 0-71): IN, ON, TYPE, ATTRIBUTE,
LOGICAL ADDRESS

N: Number of operands
IN: Instruction Number
ON: Operand Number

Fig. 4 Internal Machine Instruction and Data Descriptor Format

Operand Fetch Processor Module (OFPM). The main OFPM role is to
prepare operand data for EXPM, including data fetch, validity
check and data format conversion. OFPM fetches data from a main
memory. Data contents are examined to validate them. Then,
detailed operand attributes are set into an internal data
descriptor. For example, it is determined whether data is
positive, negative, all space, zero, alphabetic or numeric. When
an operand data is used in an arithmetic operation, OFPM converts
it into one of two internal data formats, Signed Binary Long or
Unsigned Packed Decimal, in order to be easily manipulated in
EXPM.
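The attribute checks attributed to OFPM (positive, negative, all space, zero, alphabetic or numeric) can be illustrated for a character field. The Python sketch below is a guess at what such a classification looks like in software. It assumes a display-format operand with an optional separate leading sign, which is only one of the formats in Table 1, and the function name and the dictionary keys are hypothetical.

```python
def classify_operand(data: str) -> dict:
    """Return attribute flags for a display-format field (illustrative only)."""
    stripped = data.strip(" ")
    attrs = {
        "all_space": len(data) > 0 and stripped == "",
        "alphabetic": len(data) > 0 and all(
            c == " " or (c.isascii() and c.isalpha()) for c in data),
        "numeric": False, "zero": False, "positive": False, "negative": False,
    }
    # Assumption: an optional separate leading sign precedes the digits.
    body = stripped[1:] if stripped[:1] in ("+", "-") else stripped
    if body.isascii() and body.isdigit():
        value = int(stripped)
        attrs.update(numeric=True, zero=(value == 0),
                     positive=(value > 0), negative=(value < 0))
    return attrs
```

Computing these flags once during operand fetch means the execute stage can branch on them directly instead of rescanning the data.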

Instruction Execute Processor Module (EXPM). EXPM performs
instruction execution as a final stage in a pipeline. To achieve
high performance, EXPM installs specially designed hardware units.
Especially transfer and editing operations are performed
effectively with the aid of these special hardware units, because
these operations are most frequently used in COBOL programs.

These processor modules are composed of bipolar bit-slice
sequencers (AMD 2900 series). Their instruction cycle time is 200
nsec. IFPM and OFPM micro instruction length is 48 bits and EXPM
is 72 bits long. Control storage sizes for IFPM, OFPM and EXPM are
1K, 2K, 3K words, respectively. IFPM and OFPM are implemented with
37 and 42 boards, on which a maximum of 80 ICs can be installed.
An EXPM is implemented with 25 boards, which can install a maximum
of 200 ICs.


Host  Processor  Interface 

The system is composed of the COMBAT machine and a host processor,
as shown in Fig. 3. In this section, the interface between these
two processors is described. A COBOL source program must be
translated into a COMBAT object program, before the program is
processed on the COMBAT machine. The COMBAT machine executes COBOL
language processing functions independently from the host
processor. The host processor is responsible for this translation
and also for miscellaneous functions. For example, I/O statements
(OPEN/CLOSE/DISPLAY), inter program control statements (CALL/EXIT
PROGRAM) and communication control statements (SEND/RECEIVE) are
categorised as such functions. These statements are translated
into HOST CALL instructions by the translator.

The COMBAT machine is physically connected to a host processor by
a shared main-memory interface and a shared bus interface. Data
and program code are accessed by two processors through the shared
main-memory interface. Control signals are transferred through the
shared bus interface.


Shared  Main-Memory  Interface 

Main-memory can be accessed by both the COMBAT machine and the
host processor. Data and programs are located on a host virtual
storage space as a unit of segment. Therefore, it is necessary to
translate the virtual address into a real memory address, every
time a segment is accessed. The COMBAT machine has an address
translation mechanism called Memory Processor. For high speed
translation, the Memory Processor has 8 pairs of virtual and real
address mapping registers in conjunction with an associative
memory device. Once a segment is accessed and the address
translation has been performed, the address mapping registers
contents are effective, as long as the segment stays at a certain
real memory location.
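The role of the mapping register pairs can be illustrated with a small software cache of segment translations. The Python sketch below is a hedged illustration only. The class name, the dictionary standing in for the host's segment table, and the least-recently-used replacement rule are all assumptions, since the text does not say how a register pair is chosen for replacement.

```python
from collections import OrderedDict

class MemoryProcessor:
    """Sketch of segment address translation with a few mapping-register pairs."""

    def __init__(self, segment_table, pairs=8):
        self.segment_table = segment_table   # segment id -> real base (host-managed)
        self.pairs = pairs
        self.registers = OrderedDict()       # segment id -> real base, oldest first
        self.misses = 0

    def translate(self, segment, offset):
        if segment in self.registers:
            self.registers.move_to_end(segment)       # hit: no table lookup needed
        else:
            self.misses += 1
            if segment not in self.segment_table:
                raise LookupError("segment not in main memory (would need a VMM CALL)")
            if len(self.registers) >= self.pairs:
                self.registers.popitem(last=False)    # assumed LRU replacement
            self.registers[segment] = self.segment_table[segment]
        return self.registers[segment] + offset

    def invalidate(self):
        """Drop all mappings, e.g. after the host relocates segments."""
        self.registers.clear()
```

Repeated accesses to the same few segments then cost one full translation each, and the `invalidate` method reflects the requirement that stale mappings must not survive a relocation by the host.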

When the segments have been relocated by Virtual Memory Manager
(VMM: runs on a host processor), the address mapping registers
contents must be changed. Moreover, when the COMBAT machine
accesses a segment which is not in the main-memory, the segment
must be moved to the main-memory from the secondary storage. The
host processor executes this function for the COMBAT machine (VMM
CALL).


Shared Bus Interface

Host processor bus is directly connected to the COMBAT machine. To
control the bus, a special host machine instruction, named
SUPERVISE COMBAT, is provided in the host processor. This
instruction is developed into a host micro-code, called COMBAT
Support Firmware. Its process flow is shown in Fig. 5. Under the
control of the COMBAT Support Firmware, information can be
transferred to and from the COMBAT machine through the bus.
Therefore, transferring is possible if, and only if, the host
processor is executing the SUPERVISE COMBAT instruction.

215 


.... 


^  ENTRY  ^ 


halt  COMBAT 

MICROPROGRAM 

BRANCH 


halt  COMBAT 
(HOST  CAM.) 


At  the  beginning  of  a  COBOL  program  process, 
OMBAT  Support  Program  prepares  the  execution  en¬ 
vironment.  Segment  addresses  are  loaded  in  base 
registers  (BRo)  and  other  ini  urination  in  general 
registers  (GRs) .  Especially,  BR4  is  set  to  the  top 
address  of  the  COMBAT  code  segment,  and  OKU  con¬ 
tents  are  cleared.  URO  is  used  as  a  flag  t6  con¬ 
trol  the  COMBAT  Support  Firmware  execution. 

Then, a host processor executes the SUPERVISE COMBAT instruction, that is, the COMBAT Support Firmware runs. The COMBAT Support Firmware generates an initialization signal to the COMBAT machine, and transfers the segments' information through the shared bus interface. The BR4 contents are transferred to the COMBAT machine's instruction counter.


[Fig. 5 flowchart, lower part. Box labels: halt COMBAT; MICROPROGRAM BRANCH; EXIT]

Fig. 5 COMBAT Support Firmware


COMBAT  Machine  and  Host  Processor  Interaction 

A translator program, which runs on a host processor, generates four kinds of segments. Three of them are mainly accessed by the COMBAT machine: COMBAT object code, COMBAT descriptor and COMBAT data segments. The other kind is a host object code segment called COMBAT Support Program. It includes the SUPERVISE COMBAT instruction and other codes for the execution of functions to be processed on the host processor mentioned above. The COMBAT Support Program structure is shown in Fig. 6.
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Fig.  6  COMBAT  Support  Program 
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Then,  the  COMBAT  machine  starts  to  fetch  the  COMBAT 
instructions and descriptors from the segments prepared for the COMBAT machine, through the shared main-memory interface. After that, the COMBAT Support Firmware enters a microprogram loop until an interruption condition occurs, either on the COMBAT machine or on the host processor.

When an interruption condition occurs, the COMBAT Support Firmware halts its supervising process. At this time, the COMBAT machine cannot access a new segment or issue a new HOST CALL instruction to a host processor until the COMBAT Support Firmware restarts its process. However, other processes inside the COMBAT machine can continue to be executed.

A host interruption causes a microprogram branch to an interruption process part. After interruption process completion, microprogram control either returns to the COMBAT Support Firmware and restarts the COMBAT supervising process, or dispatches to another host process. In the latter case, the contents of the base registers and general registers related to the COMBAT execution must be saved. This process is necessary for multiprogramming control.

The COMBAT machine brings about an interruption in two cases. One is when the COMBAT machine requires access to a segment which is not in the main-memory. In this case, the COMBAT Support Firmware stops its supervising process and calls a host Virtual Memory Manager software routine to move the segment to the main-memory from secondary storage. The other interruption occurs when the COMBAT machine encounters a HOST CALL instruction in the COMBAT code segment. This time, the COMBAT Support Firmware completes its execution, and the COMBAT Support Program takes a host machine cycle.

Next to the SUPERVISE COMBAT instruction in the COMBAT Support Program is an analysis routine for the HOST CALL parameter. The parameter is fetched from the COMBAT code segment through the shared main-memory interface. According to the analysis result, the COMBAT Support Program executes one of the functions to be executed by the host processor described before, e.g. EXIT program, SEND, display, etc. After the HOST CALL instruction execution, the COMBAT Support Program sets GR0 and executes the SUPERVISE COMBAT instruction again. Detecting that the value in GR0 is not equal to zero, the COMBAT Support Firmware skips the initiation phase and continues its supervising process. If the HOST CALL instruction was a STOP RUN or ERROR instruction, the COMBAT Support Program stops its execution.
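
The interaction described above can be sketched as a small simulation. This is only an illustrative model written for this report, not the actual firmware: the class names, the event tuples, the log strings and the BR4 value are all assumptions, and the segment-fault path is simplified so that the Virtual Memory Manager call is handled inside the supervising loop.

```python
from dataclasses import dataclass, field


@dataclass
class CombatMachine:
    """Toy stand-in for the COMBAT machine: replays a scripted event list."""
    events: list
    initialized: bool = False
    instruction_counter: int = 0

    def next_event(self):
        return self.events.pop(0)


@dataclass
class Host:
    gr0: int = 0            # flag: zero means "run the initiation phase"
    br4: int = 0x1000       # top address of the COMBAT code segment (made-up value)
    resident: set = field(default_factory=set)
    log: list = field(default_factory=list)


def supervise_combat(host, combat):
    """Sketch of the COMBAT Support Firmware; returns the HOST CALL parameter."""
    if host.gr0 == 0:       # initiation phase, skipped when GR0 is non-zero
        combat.initialized = True
        combat.instruction_counter = host.br4
        host.log.append("init")
    while True:             # microprogram loop until a HOST CALL arrives
        kind, arg = combat.next_event()
        if kind == "SEGMENT_FAULT":
            host.resident.add(arg)          # VMM CALL: load the missing segment
            host.log.append(f"vmm:{arg}")
        elif kind == "HOST_CALL":
            return arg                      # firmware completes its execution


def combat_support_program(host, combat):
    """Sketch of the host-side program wrapped around SUPERVISE COMBAT."""
    host.gr0 = 0
    while True:
        param = supervise_combat(host, combat)
        if param in ("STOP RUN", "ERROR"):
            host.log.append(f"stop:{param}")
            return host.log
        host.log.append(f"host:{param}")    # e.g. SEND or display
        host.gr0 = 1                        # non-zero: skip initiation next time


combat = CombatMachine(events=[("SEGMENT_FAULT", 7),
                               ("HOST_CALL", "DISPLAY"),
                               ("HOST_CALL", "STOP RUN")])
log = combat_support_program(Host(), combat)
print(log)
```

The log shows the initiation phase running only once, which is the role the text assigns to the GR0 flag.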

Evaluation  Results 

The COMBAT system is evaluated from the aspects of translation from a COBOL program to COMBAT machine instructions and their execution. For this purpose, the COMBAT translator and COMBAT machine execution are compared with the host COBOL compiler and host processor instruction execution, respectively. In order to clarify the effect of an attached high-level language machine, an attempt was made to determine how much work load is removed from the host processor.

As an evaluation measure at the COBOL program level, five typical user programs were chosen. Also, for COBOL statement level evaluation, a COBOL statement mix, consisting of 15 typical COBOL statements, was selected, based on actual user application programs.

Translator  Evaluation 

In order to clarify the difference between COMBAT and host machine architectures, instructions per function (IPF) have been measured. COMBAT machine and host processor IPFs for the statement mix are 1.7 and 5.5, respectively. These values show the remarkable proximity of the COMBAT architecture to COBOL source statements. The COMBAT architecture brings the following effects on the COMBAT translator:

•  Translator program memory reduction
•  Decrease in translation time
•  Object program memory reduction

The degree of improvement for these effects is influenced by the translation processor unit, machine architecture, translator description language and translator design algorithm. In order to evaluate the difference between COMBAT and host machine architectures, the COMBAT translator is composed in the same way as the host compiler, except for the code generation phase. These effects are evaluated with five COBOL user programs, collected from various application areas. Table 2 shows the results of the COMBAT translator performance, compared with the host compiler.

Table  2  COMBAT  Translator  Performance 


Both the COMBAT translator and the host compiler are divided into a pre-code generation part and a code generation part. The pre-code generation part design is dependent on the source language and independent of the object machine architecture. On the other hand, the code generation part design depends on the object machine.

Translator Program Capacity. The instructions per function for the COMBAT architecture are markedly reduced. Therefore, the code generation part capacity is 19% less than the host part capacity, in spite of preparing unique functions for the COMBAT architecture. The unique functions include generation of data descriptors, multi-operand instructions and host processor codes. The COMBAT translator pre-code generation part memory capacity is almost the same as that for the host compiler, because of their source language dependency and object machine architecture independency. Total memory capacity in the COMBAT translator becomes 6% less than in the host compiler.


Translation Time. COMBAT translator execution time is measured with a software monitor and compared with the host compiler, as shown in Table 2. COMBAT translation times, in the code generation part and in the whole translator, are reduced to 66% and 92%, respectively.

Object Program Capacity. COMBAT object program capacity is reduced to 59% of the host object program, as shown in Table 2. Object program capacity affects the performance in executing the object program, from the effective memory use aspect. This memory reduction has a good effect on program locality.

Execution  Time  Evaluation 


Rapid execution in each processor is realized by special hardware units, consisting of high speed register files, programmable logic arrays, etc.

COMBAT performance improvement in the COBOL statement mix, in which most statements are very simple, is limited by memory access operation, as shown in the memory usage ratio. Highly functional COMBAT architecture and extensive COMBAT hardware are not sufficiently utilized in this situation. On the other hand, a COMBAT machine has highly efficient machine instructions for complex COBOL statements, 'STRING' and 'INSPECT', and for complex data attribute manipulation, like subscripting and decimal point scaling. Therefore, COMBAT performance improvement becomes larger for these complex statements.


Average Statement Execution Time. The average statement execution times in the COMBAT machine and the host processor are evaluated. Memory access time and memory usage ratio are also evaluated. These evaluation results are shown in Table 3.


Table 3  Execution Performance for Statement Mix

                                    COMBAT    Host
Average Statement Execution Time    0.15      1.00
Memory Usage Ratio                  70%       40%
Memory Access Time                  0.80      1.00

COMBAT average statement execution time becomes one third in comparison with the host average statement execution time. The major reasons for this COMBAT machine performance improvement are considered as being:

1)  Machine  architecture 

Highly efficient COMBAT architecture leads to fewer instruction fetching and data accessing operations, due to compact object code, as shown in Table 2. For example, most literal data are directly described within the instruction and a subscripted data address is calculated using a data descriptor. As a result, literal and subscripted data are efficiently accessed.

2)  Hardware  configuration 

Memory access time from each COMBAT processor becomes longer than that from the host processor. In order to improve COMBAT memory access time, a cache memory is provided. Memory access time from the COMBAT machine with the cache memory is reduced to 80% of that from the host processor, as shown in Table 3. In spite of this memory access improvement, the COMBAT memory usage ratio, 70%, is higher than the host processor memory usage ratio, 40%. This high usage ratio is due to the COMBAT machine pipeline configuration, in which each processor independently generates memory requests and rapidly executes each part of a COMBAT instruction.


Application Program Execution Time. In order to make a program-level evaluation, COMBAT machine execution times for application programs, including input/output and other exclusive operations, are compared with the host processor execution times. In addition, to clarify the effects due to the attached COBOL machine, through-put and turnaround time improvements for application programs in the host processor are measured.


Conclusion 


A COBOL machine architecture (COMBAT architecture), a COBOL machine hardware structure (COMBAT machine) and several evaluation results have been presented. The COMBAT architecture and COMBAT machine structure are specified to be optimum from both the machine architecture and hardware design sides. The COMBAT architecture is highly optimized for COBOL program processing. The COMBAT machine is greatly specialized for the COMBAT architecture.

In addition, the COMBAT machine is aimed to be mainly a COBOL machine, attached to a host processor. Therefore, an effective and compact host processor interface is provided.

As a result of architecture optimization and hardware specialization, a highly efficient and low cost COBOL machine was obtained. Moreover, a simpler and higher performance software translator than a conventional compiler was attained, due to high COMBAT architecture functionality.

It was found that the COMBAT machine is useful as a special COBOL processor attached to a medium or large-scale commercial computer. In addition, the COMBAT machine is applicable as an element processor for a distributed-function computer system.
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Abstract 

This paper is divided into two parts. In the first part, using the method of recursive definition, we describe the machine language and give proper explanations. In the second part, we briefly discuss the main parts of the implementation of this machine language. We do not attempt to use this machine language as a replacement for the concrete design of computers; we only give, in principle, a discussion of the units which must be altered to match this machine language. Since time and space are limited, it is only a brief discussion. The first part can be referenced by users and the second part can be referenced while designing machines.
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Introduction 

Generally, the languages inside the machines are different from those outside the machines. The internal languages pay more attention to engineering factors, for example, having powerful capability of solving problems, saving devices and so on. The external languages consider the facility of use. For instance, ALGOL is used to provide the facility for scientific computation users and COBOL is used to provide
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the  facility  for  the  commercial  users;. 

Naturally, people can consider, firstly, whether we can slightly modify a common machine language (e.g. the language used in the computer 104) to bring further facility for the users, and secondly, whether we can properly modify a common external language (e.g. ALGOL) to execute it directly by computer without adding too many devices.

In this paper we chiefly discuss the second point; the first point is discussed in the paper "A machine language ridding of the dependency of address."

All of the discussions are preliminary and specialistic and are not intended to be used directly in computers.

We can imagine that we draw a line to link two terminals: one is general machine language and the other is ALGOL, the first point being near the first terminal and the second point near the second terminal. For a specified machine, there are a lot of points (i.e. schemes) to be chosen; we must choose among them according to concrete conditions, for example, whether the price of the components is very low, the reliability of the components is very high, there is an associative memory, and so on. All of these should be considered as engineering technical conditions. ALGOL can be chosen as a machine language for a special scientific computation machine.

So-called computer design, essentially, is to choose schemes based on considering special use requirements and technical conditions. Whether the scheme is good or not depends not only on the rightness of the choice but also on the size of the choice set. In this paper, we discuss three sorts of expressions instead of one sort. For a particular machine, people can arbitrarily choose one sort, two sorts, or all three sorts. After we have these three sorts of synthetic schemes, we can make it easy to determine which one we prefer among the seven possible schemes.
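
The count of seven follows from taking every non-empty combination of the three sorts of expressions. A minimal sketch of that arithmetic is given here; the names of the three sorts are placeholders, since this introduction does not name them.

```python
from itertools import combinations

# Placeholder names for the three sorts of expressions (not from the paper).
sorts = ["sort A", "sort B", "sort C"]

# A scheme is any non-empty subset of the three sorts: 2**3 - 1 = 7 choices.
schemes = [combo
           for size in range(1, len(sorts) + 1)
           for combo in combinations(sorts, size)]

for scheme in schemes:
    print(scheme)
print(len(schemes))
```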


*  This  paper  was  published  in  CHINA 
in  1963. 
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ABSTRACT 

This paper proposes a Tagged Architecture for a PASCAL oriented computer architecture. All variables are associated to Variable Descriptors and all data types are described by Type Descriptors. The proposed instruction set is directly defined from HLL statements, ordering the expressions in a Polish form and keeping inside the computer the control structure defined by PASCAL. A hardware computer is next presented which executes the above code by means of five specialized microprogrammed processors working in a pipelined manner.


INDEX  TERMS 

High Level Language computer architecture, Tagged architecture, Self-identifying data, Polish notation, Pipelined execution.

INTRODUCTION 

A  first  section  in  this  paper  presents  a  new  ap¬ 
proach  for  the  definition  of  an  instruction  set  and 
data  representation:  it  is  based  on  the  principles 
of  tagged  architecture.  The  second  section  briefly 
presents  the  currently  built  PASC-HLL  computer  that 
executes the machine language presented in Section 1.


A PASCAL programmer is allowed to "define" his own data types (so-called Software types) by structuring basic types (so-called Hardware types). Next he may "declare", inside each procedure, a set of local variables. Finally he writes his program as a structured sequence of PASCAL instructions manipulating his variables. Such a simple description of PASCAL programming directly leads to a simple architecture for a PASCAL oriented computer: its instruction set, so-called PASC-HLL, can be reduced to a manipulation of the programmer-defined variables, and the internal operations can be directed by the programmer-defined data types. Such an approach is that followed by K. JENSEN when defining the P.Code (a Pseudo-Code for a hypothetical stack computer [1]). Several implementations of P.Code interpreters are available on mini or microcomputers, but none of them really implements the original P.Code, which is based on a TAGGED architecture. Moreover, P.Code is rather far from PASCAL for its control instructions: the original PASCAL syntax no longer exists in P.Code. So we propose to keep PASCAL control structures in PASC-HLL to simplify debugging and introduce a new kind of software reliability, by providing a computer that "knows" an expression, an If-Then-Else or a control loop structure: it is a Syntax-oriented architecture [10].

II  -  THE  PASC-HLL  DATA  STRUCTURES 
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SECTION  I  -  PASC-HLL  LANGUAGE  DEFINITION 


I  -  INTRODUCTION 

Classical machines work on typeless data considered as a collection of binary digits structured as bytes or words. Except for integers or characters, there is no direct relation between the H.L.L. data types and the hardware types (the ones known by the machine). It is clear that data type definition is the most interesting characteristic of the PASCAL language: so it seems to be important to emphasize the problem of "making the hardware suit the language", i.e. to define hardware data types that suit the PASCAL ones. Moreover, we try to define an Instruction Set which suits PASCAL, in the way that it could be the simplest and the most compact executable code compiled from PASCAL, keeping its structured programming feature inside the machine itself.

*  Project supported by French contract SESORI n°78-204



Using the principles of self-identifying representation, we propose a TAGGED architecture [2] with a basic entity: the Variable Descriptor V.D. associated to each declared variable. When accessing a V.D., the machine must get enough information to perform an operation specified by the programmer. A fixed format was chosen to match with either 16 bits or 12 bits words and to simplify V.D. addressing:

V.D. = (8-bit TAG, 8-bit STYPE, 16-bit SVALUE)

TAG = (I bit, P bit, 2-bit S, 4-bit HT).

II.1. Description of the variable descriptor format

a/ V.D. is the basic information referenced by the program: it allows the machine to get all information about the associated variable. Field TAG gives all the hardware description: firstly, the Hardware Type is encoded in 4-bit HT, indicating one among 16 basic types known by the machine (they are listed in Table 1). If the variable is a PASCAL pointer, then bit P is set. Bit I indicates whether the value of the variable is present in field SV or not: if not, field SV holds an address relative to a segment whose number is given by field S, according to Table 2.

b/ Field ST (Software Type) holds an index in the Software Type Table where all the programmer defined types are described. When ST is ZERO, there is no Software Type, so the variable type is the hardware one, for example an 8-bit integer with hardware bounds (-128, +127). An example of a software type descriptor is given in Fig. 1, showing an ARRAY type descriptor holding Lower and Upper Bounds, Element Size, Element TAG and STYPE and finally the Array Size.

All this information is pointed to by the ST field of an ARRAY variable, whose access allows the machine to compute different operations depending on the operator. As an example, bound checking, address calculation and building of a Variable Descriptor for the indexed element are performed for the INDEX operator:

is LB ≤ Index ≤ UB ?

if yes then:  SV  := SV + (Index - LB) * STEP
              TAG := Element TAG
              ST  := Element ST

c/ Field SV (Software Value) holds either the value of the variable (if it can be represented by 16 bits) or its address in the other cases, i.e. for long values, for indirect values necessary before an assignment, or for structured types (arrays, records or files). A particular case is that of PASCAL pointers: their value is an address, so bit P is set, and bit I is set or not depending on whether the pointer value is present or not in SV.
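
The descriptor format and the INDEX operator described above can be illustrated with a short sketch. It is written under stated assumptions and is not the machine's microcode: the order of the fields inside the 32-bit word and of the bits inside TAG simply follows the order in which the text lists them, and the class and function names are invented for this example.

```python
from dataclasses import dataclass


def make_tag(i, p, s, ht):
    """Build an 8-bit TAG from I bit, P bit, 2-bit S and 4-bit HT (assumed order)."""
    return (i << 7) | (p << 6) | (s << 4) | ht


def split_tag(tag):
    """Return (I, P, S, HT) from an 8-bit TAG."""
    return (tag >> 7) & 1, (tag >> 6) & 1, (tag >> 4) & 3, tag & 0xF


@dataclass
class VD:
    tag: int    # 8-bit TAG
    st: int     # 8-bit STYPE: index in the Software Type Table, 0 = none
    sv: int     # 16-bit SVALUE: the value itself or a segment-relative address

    def pack(self):
        return (self.tag << 24) | (self.st << 16) | self.sv

    @staticmethod
    def unpack(word):
        return VD(tag=(word >> 24) & 0xFF, st=(word >> 16) & 0xFF, sv=word & 0xFFFF)


@dataclass
class ArrayType:
    """Software type descriptor for an ARRAY (fields named after the text)."""
    lb: int
    ub: int
    step: int       # element size
    elem_tag: int
    elem_st: int


def index(array_vd, array_type, idx):
    """INDEX operator: bound check, address calculation, new descriptor."""
    if not (array_type.lb <= idx <= array_type.ub):
        raise IndexError(f"index {idx} outside [{array_type.lb}, {array_type.ub}]")
    return VD(tag=array_type.elem_tag,
              st=array_type.elem_st,
              sv=array_vd.sv + (idx - array_type.lb) * array_type.step)


array_vd = VD(tag=make_tag(0, 0, 2, 0xA), st=3, sv=0x0100)
array_type = ArrayType(lb=1, ub=10, step=2, elem_tag=make_tag(0, 0, 2, 1), elem_st=0)
element = index(array_vd, array_type, 4)
print(hex(element.sv), split_tag(element.tag))
```

With a lower bound of 1 and an element size of 2, index 4 lands three elements past the array base, so the element descriptor carries the address 0x0106.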

II.2. The PASC-HLL stack mechanism

Since PASCAL is a block-structured language, the PASC-HLL machine requires a stack mechanism for nesting procedures during execution. If a BASE register is associated to each Lexical Level, it is well-known that the Internal Name of any variable can be built as a couple (Lexical Level, Displacement) = (LL,D), and that this name can be used during execution to access the Variable Descriptors. Previous implementations of that structure are well-known (Burroughs [3], MU5 [4], etc.). However, it is important to note that parameters must be considered as local variables inside a called procedure, but they must be evaluated in the context of the calling procedure. So we define two steps: firstly a Procedure Variable Descriptor is accessed by a CALL (LL,D) instruction which allows the machine to fetch and store the Formal Parameter Descriptors. Next, actual parameters can be evaluated and assigned, after a possible conversion. Finally, another instruction, so-called ENTER, can check that the correct number of parameters was assigned; next it computes the Mark Stack Control Word [3] and fetches the Local Variable Descriptors before entering the procedure code. Fig. 2 describes the stack mechanism.

Such a structure allows a simple and compact addressing mechanism: a positive displacement (00..63) for local variables and a negative one (-64..-3) for parameters is associated to the Lexical Level to form the Variable Internal Name. Fig. 3 gives the VIN encoding.

It is now possible to define the Access Instructions whose operand is a Variable Name: a 6-bit opcode is associated to a 10-bit Variable Name for 4 basic instructions: REF and INDD allow the machine to access a Variable Descriptor (with an indirection in the case of INDD), ASSIGN asks the machine to assign a new value to the variable, and CALL allows the access to a Procedure Variable Descriptor. Other miscellaneous access instructions are CLEAR, SET, INCR and DECR, which implement frequently used operations on variables such as I := I+1 or I := 0 (see Table 3). Now, we can show the simple I-PASCAL language.
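
As an illustration of how a 6-bit opcode and a 10-bit Variable Name fit in one 16-bit word, a hedged sketch follows. The actual VIN layout is given by Fig. 3, which is not reproduced here, so the split used below (3 bits of Lexical Level followed by a 7-bit two's-complement displacement covering -64..63) is an assumption, and the opcode numbers are hypothetical.

```python
# Hypothetical opcode numbering; the paper's real values are in its Table 3.
OPCODES = {"REF": 1, "INDD": 2, "ASSIGN": 3, "CALL": 4}
NAMES = {code: name for name, code in OPCODES.items()}


def encode_vin(ll, d):
    """Pack (Lexical Level, Displacement) into 10 bits (assumed layout)."""
    if not (0 <= ll < 8 and -64 <= d <= 63):
        raise ValueError("lexical level or displacement out of range")
    return (ll << 7) | (d & 0x7F)       # displacement kept in two's complement


def decode_vin(vin):
    ll = (vin >> 7) & 0x7
    d = vin & 0x7F
    if d >= 64:                         # negative displacement: a parameter
        d -= 128
    return ll, d


def encode_access(op, ll, d):
    """Build a 16-bit access instruction: 6-bit opcode, 10-bit Variable Name."""
    return (OPCODES[op] << 10) | encode_vin(ll, d)


def decode_access(word):
    ll, d = decode_vin(word & 0x3FF)
    return NAMES[word >> 10], ll, d


word = encode_access("ASSIGN", 2, -5)
print(f"{word:016b}", decode_access(word))
```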


III  -  THE  PASC-HLL  LANGUAGE  STRUCTURE

We have just presented access instructions and assignments. Now it is time to say that we choose the PASCAL expressions to be translated into POLISH form expressions, and that the PASCAL control instructions will be translated into equivalent PASC-HLL control instructions.

It is clear that the PASC-HLL program will have the same structure as the PASCAL program it has been translated from. An example is given in Fig. 4 which shows the equivalence: PASC-HLL offers the same structured programming facilities as PASCAL, needing the machine to base its control structure on the principles of control segments defined by a couple: (Entry Address, Return Address). Inside a control segment, the Program Counter PC is incremented, but syntactic rules must be satisfied: a Polish form expression must be completed before an ASSIGN instruction is fetched and an expression cannot start with an operator; an INDEX operator cannot be applied on any operand: its first operand must be an ARRAY and the second one must be a SCALAR.
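
Two of the syntactic rules above (an expression cannot start with an operator, and it must be complete before ASSIGN is fetched) can be checked with a simple operand counter. The sketch below assumes that the Polish form is a postfix one with binary operators only, and it uses invented token names; the type rule on INDEX operands is left out because it needs the descriptors.

```python
BINARY_OPERATORS = {"ADD", "SUB", "MUL", "DIV"}   # hypothetical operator names


def check_polish(tokens):
    """Return True if every postfix expression is complete when ASSIGN arrives."""
    pending = 0                     # operands waiting for an operator or an ASSIGN
    for token in tokens:
        kind = token[0]
        if kind == "REF":
            pending += 1
        elif kind in BINARY_OPERATORS:
            if pending < 2:         # covers "an expression cannot start with an operator"
                return False
            pending -= 1            # two operands are replaced by one result
        elif kind == "ASSIGN":
            if pending != 1:        # expression incomplete or missing
                return False
            pending = 0
        else:
            return False            # unknown token
    return pending == 0


# X := A + B * C in postfix order
program = [("REF", "A"), ("REF", "B"), ("REF", "C"),
           ("MUL",), ("ADD",), ("ASSIGN", "X")]
print(check_polish(program))
```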


IV  -  THE  PASC-HLL  SEGMENTS

Compiling a PASCAL program generates a PASC-HLL code segment holding a Type Descriptor Table, a Constant Table and, for each procedure, a couple of two elements: firstly the formal parameter and local variable descriptors, secondly the executable code. An element number is associated to each procedure: it is considered as the "value" of the procedure variable, inside its Variable Descriptor.

During execution, the PASC-HLL machine needs to access other segments: the first one, so-called Context, holds the nested procedure pointed to by the BASE registers, and the second one, so-called Dynamic, is used for dynamic allocation and accessed by means of "pointers". The Code Segment may be duplicated as an External Code Segment holding "system" or "library" procedures.

So the PASC-HLL machine knows four different segments: that feature allows it to manipulate short relative addresses which can be translated to become absolute addresses. It is then possible to have truly re-entrant code and data, and immediate protection between all the segments.
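
A minimal sketch of such a translation is given here. The segment numbers, base addresses and lengths are invented for the example (the real numbering is given by Table 2 of the paper), and the bounds check stands in for the protection between segments.

```python
# Hypothetical segment table: number -> (name, absolute base, length in words).
SEGMENTS = {
    0: ("Code", 0x4000, 0x1000),
    1: ("Context", 0x8000, 0x0800),
    2: ("Dynamic", 0xA000, 0x2000),
    3: ("External Code", 0xC000, 0x1000),
}


class SegmentViolation(Exception):
    """Raised when a relative address falls outside its segment."""


def translate(segment, offset):
    """Turn a (segment number, relative address) pair into an absolute address."""
    name, base, length = SEGMENTS[segment]
    if not 0 <= offset < length:
        raise SegmentViolation(f"offset {offset:#x} is outside the {name} segment")
    return base + offset


print(hex(translate(1, 0x10)))
```

Because the code only ever holds relative addresses, relocating a segment means changing one base entry, which is what makes re-entrant code straightforward in this scheme.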




SECTION  2  -  PRESENTATION  OF  THE  PASC-HLL  COMPUTER 

I  -  INTRODUCTION 

After the above presentation of the PASC-HLL language that was defined from the language PASCAL, we propose a hardware computer to execute that machine-language. It is the Pipelined Architecture Slice Computer for High-Level Language, so-called PASC-HLL computer, currently built using the AM 2900 family [5].

Pipelined architecture is characterised by a high degree of parallel operations concurrently running inside the computer [6]. Several attempts have been made to build high performance pipelined computers [7][8]: their efficiency depends on the way they are programmed and requires a lot of pre-processing for generating optimised code. The PASC-HLL computer is characterised by a new approach, based on a natural decomposition of the work to be executed, as explained in [9]: "a Pipeline Polish String Computer".

II  -  PIPELINED  EXECUTION  IN  PASC-HLL

The PASC-HLL order code is syntactically in Polish notation, but execution is not made using a Push-down Stack, but a FIFO evaluation queue. That structure shows that desynchronisation between instruction fetch, access to Variable Descriptors and execution of operators can be achieved. The first station in the pipeline, so-called processor PINS, fetches the next instruction from Main Memory: it executes the Control Functions (loop control, conditional branch, procedure call and return ...), and sends Access Instructions to an Access Station, so-called processor PAC, and Operators to an Operating Station, so-called processor POP. Internal instructions sent by PINS to PAC or POP go through FIFO instruction queues. Variable Descriptors fetched by PAC are sent inside the FIFO evaluation queue managed by another processor, so-called Local Storage Processor LSP.

So, several PASC-HLL instructions are concurrently in different stages of processing, either in Fetch, or Access, or Operation. Moreover, processor LSP manages a FIFO Dependency Queue, which solves the delicate problem of accessing a value which is not yet modified, but is known to be modified shortly. That queue is made up of a sixteen 12-bit word Content Addressable Memory that holds the Internal Names of the variables which are to be modified or have just been assigned: it holds the Working Set of the program, achieving good performance for data access and reducing the amount of Memory Accesses. If the Working Set is less than 16 variables (or parameters), no memory access is needed except for assignments: all the Variable Descriptors are inside the CAM. Hardware design of the PASC-HLL computer is now completed; FIVE independent processors, realised using bipolar 4-bit slices, are controlled by FIVE microprograms (the total size is 32 Kbits), with a cycle time equal to 150 nanoseconds.

The PASC-HLL computer is designed to be inserted in a large scale computing center, as a specialised CPU connected to a "Host" computer Main Memory.

The Memory Interface Processor inside PASC-HLL translates
virtual addresses sent by the PASC-HLL internal
stations (PINS, PAC and POP) into absolute memory
addresses according to the memory allocation made
by the "Host" system.

The  PASC-HLL  computer  structure  is  given  by  Fig. 5. 

III - THE DEPENDENCY PROBLEM

Suppose, for example, that the high-level instruction
"X ← <expression>" has been prepared in both
PAC and POP instruction queues, or is in the process
of execution by processor POP, when an instruction
accessing the variable X enters the access processor
PAC. That processor (PAC) must be able to
detect the fact that both the "X ← <expression>" and
"Access X" instructions refer to the same variable
X, since the "Access X" instruction might have
to be deferred until the completion of the
"X ← <expression>" instruction; otherwise the "Access
X" would not fetch the correct value of that variable,
but its old value.

The proposed solution uses a Content Addressable
Memory, organized in a FIFO mode, which holds the
names of the variables whose modification is expected
(access to their value must be deferred), and
the names of the variables which have just been
modified.

When an instruction "X ← <expression>" is fetched by
the access processor PAC, a Dependency Descriptor
is pushed into the Dependency Queue. That Descriptor
holds the internal name of the variable X (i.e.
a Lexical Level and an offset). From that time,
further references to variables are processed through
the Content Addressable Dependency Queue, and the
name of the variable X is known to "match" with a
Dependency Descriptor. In the meantime, processor
POP might have completed the execution of the
"X ← <expression>" instruction. So processor PAC can
find either the new value of the variable (just stored
by processor POP), or a Dependency Descriptor.

The second case is processed as the creation of a
Deferred Access Descriptor. As several deferred accesses
to the same variable may occur, all the Deferred
Access Descriptors related to that variable are
linked together, and they are replaced by the new
value of the variable on the completion of the expected
assignment instruction.

In the example illustrated by Fig. 6, processor POP
has just completed the modification of variable C,
and it is currently evaluating the expression to be
assigned to variable D. In the meantime, processor
PAC has created a Dependency Descriptor for variable
D and it has fetched three "Access D" instructions
which have been processed as three Deferred Access
Descriptors, since the new value of D is not yet
evaluated. After completion of the modification of
variable D, the state of the queue will be as illustrated
by Fig. 7.
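
The behaviour described above can be sketched in software. The
following Python fragment is only an illustration under stated
assumptions: the class and method names (DependencyDescriptor,
DependencyQueue, expect_assignment, access, complete) are
hypothetical, the associative search is modelled by a linear scan,
and a full queue simply evicts its oldest completed entry, a policy
the paper does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class DependencyDescriptor:
    name: tuple                 # internal name: (lexical level, offset)
    pending: bool = True        # True until processor POP stores the value
    value: object = None
    deferred: list = field(default_factory=list)  # linked deferred accesses


class DependencyQueue:
    """Software model of the 16-entry FIFO, content-addressed by name."""

    def __init__(self, size=16):
        self.size = size
        self.entries = []       # oldest entry first (FIFO order)

    def expect_assignment(self, name):
        # Called when PAC fetches "X <- <expression>".
        if len(self.entries) == self.size:
            if self.entries[0].pending:
                raise RuntimeError("queue full of pending assignments")
            self.entries.pop(0)  # assumed policy: drop oldest completed entry
        descriptor = DependencyDescriptor(name)
        self.entries.append(descriptor)
        return descriptor

    def access(self, name, slot):
        # Called for "Access X"; the newest matching entry wins.
        for descriptor in reversed(self.entries):
            if descriptor.name == name:
                if descriptor.pending:
                    descriptor.deferred.append(slot)  # Deferred Access Descriptor
                    return ("deferred", None)
                return ("hit", descriptor.value)
        return ("miss", None)    # the value must be fetched from memory

    def complete(self, descriptor, value):
        # Called when POP finishes the assignment: every linked deferred
        # access is replaced by the new value without a memory reference.
        descriptor.pending = False
        descriptor.value = value
        for slot in descriptor.deferred:
            slot["value"] = value
        resolved = len(descriptor.deferred)
        descriptor.deferred = []
        return resolved
```

In the situation of Fig. 6, three calls to access for variable D
would each return "deferred", and the later call to complete would
fill all three slots at once.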

Using the above mechanism, an "Access X" instruction
can be deferred until the completion of the
"X ← <expression>" instruction (assignment). A Deferred
Access Descriptor is pushed into the evaluation
queue. All the Deferred Access Descriptors are linked
together, eliminating a number of memory references
equal to the number of linked Descriptors.

IV - THE CONDITIONAL BRANCH PROBLEM

Both PAC and POP processors may be considered as
SLAVES of the PINS processor, in that they only execute
the internal instructions that they receive
from the PINS processor, which is thus considered
as the MASTER of the control. However, when a conditional
branch occurs, the PINS processor is not
able to choose the right next instruction, since the
conditional expression is being evaluated at the
same time, but it can choose one instruction among
all the possible next instructions (generally two).

The probability of a WRONG choice strongly depends
on the context in which the conditional branch occurs:
it is much lower for the end of a loop than for a
classical IF-THEN-ELSE statement. Given that the
different high-level conditional statements are distinguished
by different PASC-HLL instructions, the
PINS processor knows the context and can choose the
more probable next instruction. When a choice has
been made, we say that the PINS processor enters a
Conditional State, characterized by the fact that
its activity is limited to "preparation" work. In
particular, if a conditional branch is fetched again
during this conditional state, no choice is made,
but the PINS processor stops and waits for the resolution
of the first conditional branch.

When  the  value  of  the  conditional  expression  becomes 
available  in  the  POP  processor,  the  PINS  processor 
knows  whether  its  choice  was  wrong  or  not. 

In the case when the choice was right, all processors
can go on without any modification. In the other
case, all the prepared work must be disabled: this
is achieved by emptying the input instruction queues
of both PAC and POP processors, which hold wrong
instructions, and by updating both evaluation and
dependency queues, in which the sequences of wrong
operands or wrong deferred variables must be deleted;
this work is achieved by processor LSP.
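
A rough software model of the Conditional State may help to fix
ideas. It is a sketch only: the class Pins, its method names and the
table PREDICTED_TAKEN are hypothetical, the guesses in that table
are arbitrary placeholders (the text only says that the end of a
loop is mispredicted far less often than an IF-THEN-ELSE), and the
queues are assumed not to be consumed while the branch is pending.

```python
# Hypothetical static guesses, keyed by the kind of conditional instruction.
PREDICTED_TAKEN = {"ENDLOOP": True, "IF": False}


class Pins:
    def __init__(self):
        self.pac_queue = []      # internal instructions prepared for PAC
        self.pop_queue = []      # internal instructions prepared for POP
        self.conditional = False # True while in the Conditional State
        self.guess = None
        self.mark = None         # queue lengths when the state was entered

    def issue(self, pac_instruction, pop_instruction):
        self.pac_queue.append(pac_instruction)
        self.pop_queue.append(pop_instruction)

    def branch(self, kind):
        if self.conditional:
            return "stall"       # second branch: wait for the first one
        self.conditional = True
        self.guess = PREDICTED_TAKEN[kind]
        self.mark = (len(self.pac_queue), len(self.pop_queue))
        return self.guess

    def resolve(self, actually_taken):
        right = (actually_taken == self.guess)
        if not right:
            # Disable all prepared work: empty the wrong instructions.
            del self.pac_queue[self.mark[0]:]
            del self.pop_queue[self.mark[1]:]
        self.conditional = False
        return right
```

The cleanup of the evaluation and dependency queues by processor LSP
is not modelled here; it would follow the same "truncate back to the
mark" pattern.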

V  -  HOW  TO  SAVE  THE  EVALUATION  CONTEXT 

The evaluation context, represented by the intermediate
state of the evaluation queue, must be saved
when a "function call" occurs within an expression.
When the "CALL instruction" is fetched by the PINS
processor, a special order is sent to the POP instruction
queue. Since several function calls can be
nested, a SAVE area is allocated on the top of a
push-down stack managed by POP. Then, processor
PINS, which knows the current state of the evaluation
queue, generates a sequence of orders towards the
POP instruction queue. Thus, the current state of
the evaluation queue is saved by both POP and LSP
processors before the function is entered, all previous
intermediate results being compacted into the
save area.

When the function returns, processor POP is able
to restore the saved values, and the evaluation process
resumes.
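
The save and restore steps can be pictured with a very small model.
This is a sketch under assumptions: the class Evaluator and its
methods are hypothetical names, the evaluation queue is modelled as
a plain list, and the function result is assumed to re-enter the
restored queue as an ordinary operand, which the text does not state
explicitly.

```python
class Evaluator:
    def __init__(self):
        self.queue = []        # evaluation queue: intermediate results
        self.save_stack = []   # push-down stack of SAVE areas (managed by POP)

    def call(self):
        # Compact all intermediate results into a new SAVE area and
        # start the function body with an empty evaluation queue.
        self.save_stack.append(self.queue)
        self.queue = []

    def ret(self, result):
        # Restore the caller's intermediate results, then continue the
        # interrupted expression with the function result (assumption).
        self.queue = self.save_stack.pop()
        self.queue.append(result)
```

Because each call pushes its own SAVE area, nested function calls
restore in the reverse order of their entry.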

CONCLUSION 

This paper has briefly presented both the machine language
and the computer structure of the PASC-HLL computer.

It should be mentioned that pipelined execution
of Polish strings was already described in [9]
and is not explained again here. That design shows
a new kind of machine language, very compact (up to
4 times more compact than a classical machine language)
and very near the High Level Language, bringing
a new kind of software reliability during execution.
The PASC-HLL pipelined architecture is potentially
capable of high performance, since five microinstructions
are executed each cycle.

Its global performance is equal to the number of
memory accesses which are independently made by
three independent processors inside the computer,
allowing an optimum use of the Main Memory. Moreover,
there is a strong relation between HLL programming
and hardware processing, which works with the
programmer-defined variables in the programmer-defined
control environment.

HT value   Hardware Type               HT value   Hardware Type

0          8-bit integer               8          Character
1          16-bit integer              9          Char. String
2          32-bit integer              A          32-bit real
3          64-bit integer              B          64-bit real
4          8-bit powerset              C          Boolean
5          16-bit powerset             D          Bool. String
6          32-bit powerset             E          Array
7          variable length powerset    F          Record

Table 1: the PASC-HLL Hardware Types


Tag bits
I    P    S     comment

0    0    ..    value in SV field
0    1    01    pointer value in SV field
1    0    00    indirect value in CONTEXT
1    0    01    indirect value in DYNAMIC
1    0    10    indirect value in MAIN CODE (constant)
1    0    11    indirect value in EXT. CODE (constant)
1    1    00    indirect pointer value in CONTEXT
1    1    01    indirect pointer value in DYNAMIC

Table 2: Segment Addressing modes


Instruction      meaning

REF(ll,d)        access a Variable Descriptor
INDD(ll,d)       build an Indirect Variable Descriptor
ASSIGN(ll,d)     assign a value to a Variable
CALL(ll,d)       access a Procedure Descriptor
CALF(ll,d)       access a Function Descriptor
PARAM(SP,-n)     assign a value to a Parameter
CLRV(ll,d)       clear a Variable (I:=0)
SETV(ll,d)       set a Variable (B:=true)
INCV(ll,d)       increment a Variable (I:=I+1)
DECV(ll,d)       decrement a Variable (I:=I-1)

Table 3: the PASC-HLL access instructions


Encoded Internal Name
(bits 9 8 7 6 5 4 3 2 1 0)     Comment

0 0      D                     D = 0..255 for Global Variables
0 1 0    d    \
0 1 1    d     |               d = 0..63 for local Variables
1 0 0    d     |               d = -64..-2 for Parameters
1 0 1    d     |
1 1 0    d    /
1 1 1 0  n                     Special WITH addressing
1 1 1 1  n                     Special SP relative addressing

Fig. 3 - The PASC-HLL Variable Internal Name encoding (10-bit)


PASCAL structure                 PASC-HLL structure

X := T[I+1]-1;                   REF(T), REF(I), INC, INDEX, DEC, ASSIGN(X);

while exp do stat;               LOOP({), exp, WHILE, stat, ENDLOOP;

for I := exp1 to exp2 do stat;   exp1, exp2, FORUP(1),
                                 ASSIGN(I), stat, ROFUP;

case exp of 0,1: stat1;          exp, CASE(j),
            3,4: stat2;          LIT(0), LIT(1), OF(f), stat1, FO,
            else: stat3          LIT(3), LIT(4), OF(f), stat2, FO,
end;                             stat3, FO;

if exp then statement;           exp, IF(|), statement, FI;

PROCNAME(exp1, exp2);            CALL(PROCNAME), exp1, PARAM(SP,-3),
                                 exp2, PARAM(SP,-4), ENTER;

SET := [3, I..J];                LIT(0), LIT(3), ADDELEM,
                                 REF(I), REF(J), SUBSET, UNION,
                                 ASSIGN(SET);

Fig. 4 - Equivalence between PASCAL and PASC-HLL structures

[Figure labels: "Just modified variables", "Deferred access variables"]


Abstract

MCODE is a high-level language, stack machine designed to support
strongly-typed, Pascal-based languages with a variety of data
types. The instruction set is constructed for efficiency and
extensibility and is based on an examination of common programming
language operations. The architecture provides programmed control
over both operand type selection and address field widths. In
addition, right operand addressing is included to improve the size
characteristics of MCODE instructions over those of traditional
stack machines. The design is compared for efficiency with the
instruction sets of the EM-1, Digital Equipment PDP-11 and
VAX-11/780.

CR Categories: 4.12, 4.22, 4.9, 6.21

Keywords and Phrases: stack machine, computer architecture,
addressing modes.

1. Introduction

With the growing use of high-level languages for systems and
applications programming, computer instruction set design has moved
from bit selection of internal CPU data paths to instruction sets
which are oriented to common high-level language operations.
Tanenbaum[10] discusses a stack machine (EM-1) designed with this
philosophy. The EM-1 is intended to directly execute the code
produced by the SAL compiler. SAL is a typeless systems programming
language similar to BCPL[9]. In this paper, we have extended the
EM-1 to provide an instruction set for a Pascal-based,
strongly-typed, systems programming language, Modula[12], which was
designed by Wirth and implemented by Cook[6]. Our Modula machine
code, MCODE, not only provides extensible type operations but also
maintains the efficiency of the EM-1. The EM-1 was designed based
on an analysis of 300 procedures comprising 10,000 lines of code.
The MCODE improvements are based on our analysis of 5,000 Pascal
procedures with over 160,000 lines of program text.

The next section gives a brief EM-1 description which is followed
by a discussion of the MCODE improvements. Also, the instructions
used for expressions and Modula statements are illustrated.
Finally, some comparisons are drawn with respect to other current
architectures, including the PDP-11[1] and VAX[2].

* The author is partially supported by U. S. Army contract
DAAG29-75-C-0024 and National Science Foundation grant MCS-7903947.

2. Background

Tanenbaum designed the EM-1[10] to optimize the most frequently
occurring high-level operations in programs as analyzed by himself,
Knuth[8], Alexander and Wortman[3], and Wortman[13]. The most
effective innovations in the EM-1 are encoding references to the
first 12 bytes of local procedure storage and 8 bytes of static
storage as single opcodes, array element accessing, and "if"
statement comparison and branching. The hypothesis is that smaller
code sizes will enhance faster program execution by better
utilizing the bandwidth of CPU data paths. In addition, as the
machine gets closer to the source language, compilers can produce
more efficient code and can eliminate space-consuming peephole
optimization routines.

Another important aspect of the EM-1 design is the notion of giving
the programmer code improvement tools which are machine
independent. In Knuth's Fortran analysis[8], he strongly suggested
that program execution histories be automatically generated for
each job. With Tanenbaum's machine organization, the programmer
need only declare the most frequently used variables first in
textual order to effect a performance improvement.

3. Extensions

The first problem that we found in trying to use the EM-1 design
was its lack of a variety of data types. Modula provides the user
with character, Boolean, long and short integer, and floating point
operations. When the EM-1 is extended to encompass these
operations, the 255 opcode limit is quickly exceeded. Our solution
was to introduce modes of computation. A mode sets the CPU's fetch
and execute microprogram to adapt to a particular data type such as
floating or integer. A collection of 8-bit opcodes is provided to
set the CPU mode. Therefore, a single "+" opcode suffices for all
addition operations on any data type. The setting of the mode can
be thought of as the replacement of the microcode jump table for a
subset of the opcodes.

The mode approach is based on our observation
that expressions are usually
comprised of operands of the same type;
thus, we expect that the space occupied by
any extra instructions needed to set the
mode will be offset by the savings in opcode
space. Modes also provide an expansion
and contraction capability for machine
families. For instance, all floating point
operations could be eliminated to build
microprocessors intended for traffic control
or a decimal mode could be added for commercial
applications. For many environments,
the savings in microcode space could
be significant.
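
As an illustration of the mode idea, the sketch below models the
"microcode jump table" as a Python dictionary that a MODE
instruction swaps. The names JUMP_TABLES, Machine and wrap16, the
two mode names and the 16-bit integer wraparound are assumptions
chosen for the example; they are not taken from the MCODE
definition.

```python
def wrap16(x):
    """Reduce an integer to a signed 16-bit value (assumed word size)."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


# One jump table per mode: the same opcode maps to a different routine.
JUMP_TABLES = {
    "integer": {
        "+": lambda a, b: wrap16(a + b),
        "*": lambda a, b: wrap16(a * b),
    },
    "floating": {
        "+": lambda a, b: float(a) + float(b),
        "*": lambda a, b: float(a) * float(b),
    },
}


class Machine:
    def __init__(self):
        self.stack = []
        self.table = JUMP_TABLES["integer"]   # current mode

    def run(self, program):
        for opcode, *operands in program:
            if opcode == "MODE":              # replace the jump table
                self.table = JUMP_TABLES[operands[0]]
            elif opcode == "PUSH":
                self.stack.append(operands[0])
            else:                             # mode-dependent operator
                b = self.stack.pop()
                a = self.stack.pop()
                self.stack.append(self.table[opcode](a, b))
        return self.stack
```

The same "+" opcode thus adds 16-bit integers in one mode and
floating point values in another, at the cost of one MODE
instruction whenever the operand type changes.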

Our second extension was to provide
direct addressing for right operands. According
to all of the analyses, expressions
tend to be simple. Tanenbaum found, for
instance, that 31% of all assignment statements
had a single term for a right hand
side. Consider the evaluation of "a+b" on
a typical stack machine. We must "push a",
"push b", "pop b and add", and "replace a
with result". The alternative is to "push
a", "add b", and "replace a with result".
This sequence not only saves an instruction
fetch but also the redundant push and pop
of "b" plus the instruction space. These
savings will be replicated for every term
in any expression which can be evaluated
from left to right.
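
The saving can be counted with a small sketch. The two generators
below emit code for an assignment "target := t1 + t2 + ..." in the
pure stack style and in the right-operand style; the function names
and the tuple instruction format are hypothetical and serve only to
make the instruction counts concrete.

```python
def stack_code(target, terms):
    """Pure stack machine: every operand is pushed before it is used."""
    code = [("PUSH", terms[0])]
    for term in terms[1:]:
        code += [("PUSH", term), ("ADD",)]
    code.append(("POP", target))
    return code


def right_operand_code(target, terms):
    """Right-operand addressing: ADD names its second operand directly."""
    code = [("PUSH", terms[0])]
    code += [("ADD", term) for term in terms[1:]]
    code.append(("POP", target))
    return code


def run(code, env):
    """Tiny interpreter used to check that both sequences agree."""
    stack = []
    for instruction in code:
        op = instruction[0]
        if op == "PUSH":
            stack.append(env[instruction[1]])
        elif op == "ADD" and len(instruction) == 2:
            stack[-1] += env[instruction[1]]      # right operand from memory
        elif op == "ADD":
            right = stack.pop()                   # right operand from stack
            stack[-1] += right
        elif op == "POP":
            env[instruction[1]] = stack.pop()
    return env
```

For "a := a + b" this gives 4 instructions against 3, and each
further term widens the gap by one instruction.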

Finally, we have extended Tanenbaum's
single byte addressing modes, provided an
option to shorten address fields, improved
subscripting, record and pointer referencing,
and introduced some additional high-level
language oriented constructs. In the
next section, we will discuss operand
addressing.


4. Operand Addressing

The three MCODE instruction formats are illustrated below:

FORMAT 1:
    FORM 2,3,3    0, opcode, local address

FORMAT 2:
    FORM 8        opcode [operands]

FORMAT 3:
    FORM 8,8      255, opcode [operands]

In MCODE, addressing is partitioned into references to either
static or local procedure storage. The MCODE machine uses byte
addressing and has an address space of 2**32 bytes. The instruction
formats are designed so that the most frequently occurring
operations require a minimum of instruction space.

A format 1 instruction can address the first 8 16-bit words of the
current procedure's activation record. The impact of this
convention can be seen by noting that our results indicate that 97%
of all procedures have fewer than 4 formal parameters and 90% of
all procedures have fewer than 4 local variables. Tanenbaum's short
address convention for static variables was eliminated since the
size of the static address space is not known until load time.
However, the number of parameters and local variables is known at
compile time. In addition, our analysis shows that 54% of all
variable references were to local variables or parameters. To test
the effect of this idea, we changed all the local variables in the
Modula compiler to C[7] "register" variables which decreased each
instruction reference by 16 bits. The compiler's size decreased by
10% and its compile rate went up several hundred lines per minute.

The format 2 and 3 instructions can have their operands on the
stack or can have a right operand specification. Operand addressing
is optimized in a fashion similar to that provided by the
B1700[11]. The AMODE instruction sets the address field width to 8,
16, or 32 bits for references to either static or local storage.
Note that program counter relative addressing is not affected. More
than 90% of all Modula programs can use an AMODE which selects
8-bit local and 16-bit static addresses.

As an example, the 8-bit AMODE setting would save 8 bits per
operand reference over the 16-bit addresses used in the PDP-11. The
AMODE setting has no effect on indirect addressing on the stack.
The VAX implements 8-bit address fields but an 8-bit selector is
also required for a total of 16 bits.

A natural concern, however, is keeping AMODE set correctly. Since
Modula has no "go to", the AMODE bookkeeping is easily maintained
on the parse stack. Also, the procedure call instructions
automatically save and restore mode information. In addition, the
linkage editor is responsible for checking address field overflow
if too small an AMODE is being used. MCODE implements the following
addressing forms:

A   operands on the stack
B   {static | local} x {direct | indirect}
C   local direct
D   indirect address on the stack
E   32-bit absolute address
F   constant (8, 16, 32 bits)
G   constant (0-15)
H   {subscript | element} x B
        subscript = ((sp↑)-1) * Mode size + EA    (EA = Effective Addr.)
        element   = ((sp↑) + EA)
I   local x (direct, indirect, indirect x (subscript, element))
J   8-bit jump offset
K   16-bit jump offset

Forms B and H cover accesses to simple variables, pointers, one
dimensional arrays, and record elements occurring in static and
local storage, or as parameters. Subscript addressing assumes a
lower bound of one which is the most common case. For direct
addressing, different lower bounds can be subtracted from the
address field to produce the correct subscript calculation. Forms F
and G are used for immediate addressing while forms E, J and K are
used for program counter relative jumps and absolute addressing.
Forms I and C are used with the format 1 (8 bit) instructions. Form
I can be used to access local variables, "const" simple parameters,
"var" simple parameters, and array and record parameters.

Tanenbaum[10] recommends that references to global procedure
variables be implemented by a microcode search of the procedure
call back-chains. The claim is that this method eliminates the
overhead of maintaining a static display. Based on our experience
with implementations of Algol[5] and Pascal[4], a single reference
to a global variable uses more time than that needed to update the
display. The following code sequence is typical.

procedure entry:

    CONTROLBLOCK[SAVE] = DISPLAY[NEST]
    DISPLAY[NEST] = PB

procedure exit:

    DISPLAY[NEST] = CONTROLBLOCK[SAVE]

The first ten locations in static
storage are used for the DISPLAY. According
to our study, 85% of all procedures
were not nested; 11% were nested one level;
and 4% were nested 2 or more levels. Out
of the 5,000 procedures that we examined,
one was nested to 4 levels. Therefore, a
maximum of ten nesting levels was considered
sufficient. Next, we will examine
the format of the one byte instructions.
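
The display update on procedure entry and exit can be modelled in a
few lines. The sketch assumes that each activation has a control
block with a SAVE cell and that PB denotes the base address of the
new activation; the class Display and the dictionary used as a
control block are hypothetical.

```python
class Display:
    def __init__(self, levels=10):
        self.display = [None] * levels   # one entry per nesting level

    def enter(self, nest, control_block, pb):
        # Save the previous owner of this level, then install the new base.
        control_block["SAVE"] = self.display[nest]
        self.display[nest] = pb

    def exit(self, nest, control_block):
        # Restore whatever this level pointed to before the call.
        self.display[nest] = control_block["SAVE"]
```

Two stores on entry and one on exit are the whole cost, and a
recursive call at the same nesting level is handled by the saved
value in each control block.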

5.  Local  Variable  References 

We followed Tanenbaum's design by allocating
64 opcodes to special addressing.
As we discussed previously, the local variable
address space was set at 8 16-bit
words, or a 3-bit address field. This left
3 bits for opcodes. These 8 opcodes were
partitioned as follows:


PUSH     Form I      (sp↓) = (EA)
POP      Form C      (EA) = (sp↑)
ADD      Form C      (sp) += (EA)
SUB      Form C      (sp) -= (EA)
CMPB=    Form C,K    if (sp↑)=(EA) then (pc) += SE(K)
CMPB<>   Form C,K    if (sp↑)<>(EA) then (pc) += SE(K)


The PUSH instruction uses two opcodes
for direct or indirect references to simple
variables, and two opcodes for indirect, or
"var", references to arrays and records.
The number of addressing modes for POP was
decreased to one in order to increase the
number of opcodes. In addition, we found
that variable loads occur in a 2.7/1 ratio
over variable assignments which indicates
that POP is used less frequently than PUSH.
The last four opcodes were assigned based
on our frequency of use information. Out
of all operator occurrences, "+" was used
21% of the time, "-" was used 9%, "=" was
used 20%, and "<>" was used 10% of the
time. According to Tanenbaum, the dynamic
frequency of these operators is even
higher. In conditional expressions, we
found that "=" made up 33% of all operators
and that "<>" was used 17% of the time.
Since Tanenbaum found that "if", "repeat",
and "while" had a dynamic frequency of 38%,
the comparisons were implemented to both
test and jump. Using these formats, many
subprograms can be completely coded using
only 8 bit instructions.
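
A format 1 instruction packs three fields into one byte. The sketch
below assumes that the fields of "FORM 2,3,3" are laid out from the
most significant bit down (a 2-bit zero tag, a 3-bit opcode, a 3-bit
local word address); that bit order and the function names are
assumptions made for the illustration.

```python
def encode_format1(opcode, local):
    """Pack a 3-bit opcode and a 3-bit local word address into one byte."""
    if not (0 <= opcode < 8 and 0 <= local < 8):
        raise ValueError("opcode and local address must fit in 3 bits")
    return (opcode << 3) | local          # top two bits stay 0


def decode_format1(byte):
    """Return (opcode, local address), or None if not a format 1 byte."""
    if byte >> 6 != 0:
        return None                       # formats 2 and 3 use other codes
    return (byte >> 3) & 0b111, byte & 0b111
```

The 8 opcodes times 8 local addresses account for exactly the 64
one-byte codes reserved for special addressing.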

6.  Right  Operand  Addressing 

Because of the number of opcodes needed
for right-operand addressing, we restricted
the operators based on the same
frequency analysis which was used to select
the 8-bit instruction set. The following
table lists the instructions which can
address memory:


PUSH     Form A,B,D,F,H,G   (sp↓) = (EA)
POP      Form A,B,D,H       (EA) = (sp↑)
PUSHA    Form B,E,H         (sp↓) = EA
ADD      Form A,B,F         (sp) = (sp) + (EA)
ADDTO    Form B             (EA) = (EA) + (sp↑)
AND      Form A,B,F         (sp) = (sp) & (EA)
CLR      Form B             (EA) = 0
CMPB=    Form A,B,F         if (sp↑)=(EA)
CMPB<>   Form A,B,F         if (sp↑)<>(EA)
DEC      Form B             (EA) = (EA)-1
INC      Form B             (EA) = (EA)+1
MUL      Form A,B,F         (sp) = (sp) * (EA)
SUB      Form A,B,F         (sp) = (sp) - (EA)
SUBFM    Form B             (EA) = (EA)-(sp↑)

The selected operators make up 80% of all operator references in
the Pascal programs that we analyzed. Address modes B and F were
chosen since 35% of all operand references were to simple variables
and 36% of all operands were constants. The ADDTO and SUBFM
instructions correspond to Modula statements.

7. Array, Record and Pointer References

Simple record references are treated just like simple variable
references and can be accessed using direct addressing. However,
arrays of records or records as parameters must be accessed by an
offset from a base address. The "element" address mode implements
the pointer or parameter case.

Our analysis showed that 20% of all
array references had a single constant subscript
and that 60% of all subscripts were
a single variable. The constant subscript
case resolves to a variable address so the
standard address formats can be used to access
the array. The "subscript" mode was
introduced to implement accesses to one dimensional
arrays. In fact, we found that
references to multidimensional arrays made
up only 10% of all array references. MCODE
uses descriptors to implement the multidimensional
case.

In the EM-1, every array has an array
descriptor cell, an array descriptor packet
and an array data area. This approach
works fine for Algol but not for Pascal-like
languages. First, in Pascal, all arrays
have static bounds so a single
descriptor can be generated in static
storage. This approach allows descriptors
to be shared and saves stack space as well
as setup time. Secondly, Pascal allows arrays
of arrays and pointers to arrays which
implies that the base address of an array
may already be on the stack and not in a
descriptor. The MCODE SUBS instruction
transforms the subscripts into a single
byte offset which can then be used by the
PUSH or POP instructions. The SUBS instruction
also checks each subscript for
validity.
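
What SUBS does can be sketched as follows. The descriptor layout
used here (a list of bounds pairs plus an element size, with
row-major multipliers derived from the bounds) is an assumption made
for the example; the names ArrayDescriptor and subs are
hypothetical, and lower bounds are subtracted explicitly rather than
folded into a virtual origin.

```python
from dataclasses import dataclass


@dataclass
class ArrayDescriptor:
    bounds: list          # one (low, high) pair per dimension
    element_size: int     # in bytes

    def multipliers(self):
        # Row-major strides in elements (assumed storage order).
        stride, strides = 1, []
        for low, high in reversed(self.bounds):
            strides.append(stride)
            stride *= high - low + 1
        return list(reversed(strides))


def subs(stack, descriptor):
    """Pop one subscript per dimension, check it, push a byte offset."""
    count = len(descriptor.bounds)
    subscripts = [stack.pop() for _ in range(count)][::-1]
    offset = 0
    for value, (low, high), stride in zip(
            subscripts, descriptor.bounds, descriptor.multipliers()):
        if not low <= value <= high:
            raise IndexError(f"subscript {value} outside {low}..{high}")
        offset += (value - low) * stride
    stack.append(offset * descriptor.element_size)
```

Because the descriptor holds only static information, one instance
can serve every variable of the same array type.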


SUBS    descriptor address

The instruction address points to an array descriptor which
contains the number of bounds, bounds pairs, multipliers, element
size and virtual origin. SUBS leaves the element index on the
stack. For instance, "A[I].B[J]" would produce the following code.

PUSH    I
SUBS    A desc
PUSHA   element
PUSH    J
SUBS    B desc
ADD

For most Modula programs, each array type can be described by a
single instance of a descriptor no matter how many variables of
that type are created. Next, the expression operators will be
described.

8. Operators

The following table lists the MCODE operators which are all format
2 instructions.

ABSolute          LoGical Shift
ARith. Shift      MOD
CONVert           NEGate
DECrement         NOT
Divide            OR
DUPlicate         SQuaRe
INCrement         XOR

MCODE also includes instructions for moving and comparing blocks of
storage as well as library call instructions to implement the
Modula virtual machine environment and the floating-point math
routines. In the next section, the code generated for the "case",
"if" and "for" statements will be discussed.

9. Statements

Procedure call and return are very similar to the EM-1, except for
the display updating, and will not be described. The "if" statement
is implemented with the following instructions:

CoMPare           =  >  <  >=  <=  <>
CoMPare Branch    =  >  <  >=  <=  <>
Branch            =0  <>0
Branch

As an example, the statement "if a<>b then inc(a) end" would
generate the following code:

Instructions    Size     PDP-11      Size

PUSH a           8       a,b
CMPB= b L1      24       JEQ L1       16
INC a           16       INC a        32


The syntax and code generated for the "for" statement are listed
below:

for v := e1 by e2 to e3 do S end

        PUSHA   v
        PUSH    e1
        PUSH    e2
        PUSH    e3
        FOR     L2
L1:     S
        ENDFOR  L1
L2:

The "case" instructions are as follows:

CASE      constant, offset
CASE      constant, constant, offset
CASETBL   constant, constant

These three forms cover the situations in which the "case" is
distinguished by a single value, a range of values, or a jump
table. Next, we will analyze the effectiveness of MCODE with
respect to other machine designs.




10. Comparison with Other Machines

The results in Figure 1 extend the
table in Tanenbaum[10] to include the VAX
and MCODE. Obviously, the special addressing
and descriptor-based array computations
make a significant difference. MCODE performs
better than the EM-1 for expressions
and parameter referencing and is as good in
all other areas. The difference in the
"if" tests occurs because the EM-1 assumes
a 2-bit field for branch offsets while we
used an 8 bit field. The VAX instructions
are computed using 8 bit displacement addressing.
In addition, it should be pointed
out that the VAX and MCODE are supporting
many more data types than the PDP-11 or
the EM-1. Figure 2 recomputes the space
for the same statements but with all the
machines forced to use 16 bit addressing.

The values in Figure 2 give a lower
bound on the performance of MCODE whereas
Figure 1 gives an upper bound on the
difference. For 16-bit addressing, which
would be used for references to static
storage, MCODE is better in all categories.
The EM-1 is forced to use a 16-bit opcode
to access 16-bit addresses which results in
its poor performance. Since 47% of all
variable references are to static storage,
we feel that this improvement could have a
significant impact on execution speed. The
VAX is still quite poor with respect to
subscripting even though a special instruction
is available for that purpose. Also,
the figures do not reflect the dynamic effect
of the savings since Tanenbaum's measurements
indicate that the Figure 1 results
are even more significant at runtime.

11. Conclusions

We feel that the availability of modes as an extension mechanism
for high-level language machines can be a significant factor in
adapting microprocessors to changing environments. Also, modes
contribute to space efficiency in the instruction set. The use of
address mode settings to reduce address field sizes and right
operand addressing also contribute to space savings. The current
version of Modula produces PDP-11 or VAX code, so we have the means
to compare the exact statistics on the static and dynamic behavior
of MCODE with these machines using the same programs in the same
environment. Our analysis should contribute to the alternatives
available for opcode design in modern machine families.
Figure 1

Direct Addressing Instruction Size (in bits)

Statements                             MCODE   EM-1   PDP-11    VAX

i:=0                                      16      8       32     24
i:=3                                      16     24       48     32
i:=j                                      16     16       48     40
i:=i+1                                    16      8       32     24
i:=i+j                                    24     32       48     40
i:=j+k                                    24     32       96     56
i:=j+1                                    24     24       80     48
i:=a[j]                                   24     32      128    104
a[i]:=0                                   32     32      112     88
a[i]:=b[j]                                40     48      192    168
a[i]:=b[j]+c[k]                           64     80      304    248
a[i,j,k]:=0                               64     48      176    200
if i=j then                               32     24       64     64
if i=0 then                               24     16       48     48
if i=j+k then                             40     40      112     96
if flag then                              24     16       48     48
call p                                    24     16       64     32
call p(i) value                           32     24       96     56
call p(i,j)                               40     32      128     80
call p(i) byref                           40     32      112     56
for i:=1 by 1 to N do a[i]:=0 end        104     88      176    116
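
As a rough aid to reading Figure 1, the sizes in a row can be expressed as multiples of the MCODE size. The sketch below copies three rows from the figure; the dictionary layout and the helper name relative_to_mcode are illustrative assumptions, not part of the paper.

```python
# Three rows copied from Figure 1 (instruction size in bits).
FIGURE_1_ROWS = {
    "i:=j+k":      {"MCODE": 24, "EM-1": 32, "PDP-11": 96,  "VAX": 56},
    "a[i]:=b[j]":  {"MCODE": 40, "EM-1": 48, "PDP-11": 192, "VAX": 168},
    "call p(i,j)": {"MCODE": 40, "EM-1": 32, "PDP-11": 128, "VAX": 80},
}


def relative_to_mcode(sizes):
    """Return each machine's size as a multiple of the MCODE size."""
    base = sizes["MCODE"]
    return {machine: bits / base
            for machine, bits in sizes.items() if machine != "MCODE"}


if __name__ == "__main__":
    for statement, sizes in FIGURE_1_ROWS.items():
        print(statement, relative_to_mcode(sizes))
```

A ratio above 1 means the other machine needs more bits than MCODE for that statement; a ratio below 1 means it needs fewer.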
Figure 2

16-Bit Address Fields

Statements               MCODE   EM-1   PDP-11

i:=0                        24     32       32
i:=3                        32     48       48
i:=j                        48     84       48
i:=i+1                      24     32       32
i:=i+j                      48    184       48
i:=j+k                      72    104       96
i:=j+1                      56     80       80
i:=a[j]                     72     96      128
a[i]:=0                     64     72      112
a[i]:=b[j]                  96    128      192
a[i]:=b[j]+c[k]            152    200      304
a[i,j,k]:=0                128    136      176
if i=j then                 64     96       96
if i=0 then                 48     64       80
if i=j+k then               88
if flag then                48     64       80
Bell Laboratories
Murray Hill, New Jersey 07974


ABSTRACT

The instruction set for the SYMBOL computer system is discussed in
detail. The SYMBOL computer is a large scale multiprocessor which
implements a high level language, compiler, text editor and
time-shared operating system entirely in hardware. The intent of
the paper is to document the instruction set, as used in the
working system for over seven years. Covered are the internal
codes, what they do, and the associated machine maintained data
structures.

Introduction 

The SYMBOL computer system [1,2,3] is of great importance in the
field of computer architecture since it represents a major
departure from von Neumann architectures and is one of the few
examples of an experimental (or commercial) machine that resulted
in a full scale working High Level Language Computer System.
Although the high level SYMBOL Programming Language (SPL) was
implemented in the machine without the aid of software, SYMBOL does
have an internal instruction set much like any computer. Unlike
most computers, however, the SYMBOL Instruction Set is non-von
Neumann and at a very high level, with almost a one-to-one mapping
between tokens in the source code and instructions in the object
code. Though the instruction set is probably the best way to
describe the computational abilities of SYMBOL, it has been one of
the least documented features. This paper seeks to fill this gap by
providing a detailed description of the instructions, how they are
executed, and the internal data structures used in executing SYMBOL
object programs.

SYMBOL  Architecture  Overview 

Because of SYMBOL's unusual machine architecture, a brief
description of the system is in order. The SYMBOL computer system
is composed of eight relatively autonomous processors: the System
Supervisor, the Input/Output Processor, the Channel Controller, the
Drum Controller, the Memory Reclaimer, the Memory Controller, the
Translator, and the Central Processor. The last three of these are
of special interest for the purposes of this paper.

Program execution is controlled by the Central Processor, which is
itself composed of four sub-processors. The Instruction Sequencer
is responsible for fetching instructions, executing some directly
and delegating the rest to another sub-processor. The Arithmetic
Processor [4] performs traditional arithmetic operations with
precision controlled, decimal arithmetic. The Format Processor [5]
handles character oriented operations, as well as the packing and
unpacking of numbers. Lastly, the Reference Processor controls all
identifier referencing.

One of the more unusual aspects of SYMBOL is that the memory
structure is not organized as a contiguous set of sequentially
numbered storage cells. Instead, storage is viewed by most of the
processors as a limitless supply of variable length storage
strings, whose storage cells (machine words) are logically
sequential but may not be consecutively addressed in memory. The
SYMBOL memory structure consists of four hierarchical levels. At
the lowest level a core memory and a rotating magnetic drum
constitute the physical memory. Next is a paged virtual storage
system consisting of 2^24 64-bit words, of which 4096 2K-byte pages
were implemented. The Memory Controller, with a set of high level
memory operations, maps virtual storage into "logical storage."
[6,7] At the highest machine level are the operations which operate
on the user data structures. Throughout the rest of this paper the
word "string" referring to storage means a separate and logically
sequential series of words, used in the same manner as segments in
other computer systems.

The SYMBOL Programming Language

Because the instruction set of the SYMBOL machine is so directly
tied to the language it implements, the reader will find the
following sections easier to understand by referring to one of the
many descriptions of the language. [5,8,9,10,11] Basically, SPL is
a general-purpose procedural block-structured language. In many
ways, it can be viewed as a mixture of APL, ALGOL and LISP. The
language is free of most declarations as to the size or type of
data objects; these attributes can vary dynamically during the life
of a program. Data objects are either scalars (i.e. a sequence of
characters that may happen to fit the definition of a number,
Boolean, or string), or the object is a structure whose elements
are either scalars or other structures. Structures may be of any
arbitrary shape which is representable by a tree, and may not be
recursively defined. Procedures pass parameters via call-by-name,
also known as call by substitution. There are no automatic
variables; all variables are statically allocated. Scoping rules
are such that a variable is known only locally, unless explicitly
declared to be global. SPL also has ON blocks, similar in many ways
to ON blocks in PL/I.

Instruction  Set  Overview 




SYMBOL instructions are ordered in reverse Polish notation and make
use of an expression evaluation stack. SYMBOL uses both descriptors
and tags for recording the attributes and structure of data.
Descriptors are grouped together in Name Tables, generated by the
Translator at compile time. Type tags are associated with the data:
at the beginning of a data object the tag records the type; a tag
is also used to denote the end of a data object. The basic
instruction set is shown in Table 1, with the internal bit
representation shown in hexadecimal. Throughout this paper internal
codings will be shown in hexadecimal unless otherwise indicated.
Addresses in SYMBOL are 24 bits long and address sixty-four bit
words. For hardware bussing simplicity, each word contains a
maximum of two instructions, each half-word instruction consisting
of an eight bit opcode followed by a twenty-four bit address field.
Only six of the opcodes require an address.
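
The half-word layout just described (an 8-bit opcode followed by a 24-bit address, two instructions per 64-bit word) can be sketched with integer shifts and masks. The function names are hypothetical, and placing the first instruction in the high-order half of the word is an assumption; the paper only states the field widths.

```python
ADDRESS_BITS = 24
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1   # 0xFFFFFF
HALFWORD_MASK = (1 << 32) - 1


def pack_halfword(opcode, address=0):
    """Combine an 8-bit opcode and a 24-bit address into one 32-bit half-word."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in 8 bits")
    if not 0 <= address <= ADDRESS_MASK:
        raise ValueError("address must fit in 24 bits")
    return (opcode << ADDRESS_BITS) | address


def pack_word(first, second):
    """Pack two (opcode, address) pairs into a 64-bit word (first = high half, assumed)."""
    return (pack_halfword(*first) << 32) | pack_halfword(*second)


def unpack_word(word):
    """Split a 64-bit word back into its two (opcode, address) pairs."""
    halves = ((word >> 32) & HALFWORD_MASK, word & HALFWORD_MASK)
    return [(half >> ADDRESS_BITS, half & ADDRESS_MASK) for half in halves]


if __name__ == "__main__":
    # A No-op (00) followed by a Block instruction (90) with an example address.
    word = pack_word((0x00, 0), (0x90, 0x001234))
    print(f"{word:016X}", unpack_word(word))
```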


Internal Representation of Data Values

The storage format for scalar character string values consists of a
String Start character (F5), followed by the characters in the
string in ASCII, followed by a String End character (F6); this is
called the data string format. Scalar values appear in the object
string as a result of literals in the source string. A scalar value
may be stored in a Name Table if it is six or fewer characters in
length (since there must be room for the string, start and end
characters in an eight character word). For longer strings, a
memory string is allocated, and a pointer to this string is placed
in the Name Table. However, if the string should later shrink to
six or fewer characters, it would remain in the memory string, and
not be placed in the Name Table.
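
A minimal sketch of the data string format and of the six-character rule for Name Table storage follows. The helper names are hypothetical; only the F5 and F6 codes and the eight-byte word size come from the text.

```python
STRING_START = 0xF5   # String Start character
STRING_END = 0xF6     # String End character
WORD_BYTES = 8        # one 64-bit machine word


def to_data_string(text):
    """Encode a scalar in data string format: F5, ASCII characters, F6."""
    return bytes([STRING_START]) + text.encode("ascii") + bytes([STRING_END])


def fits_in_name_table(text):
    """True if the value plus its start and end characters fits in one word."""
    return len(to_data_string(text)) <= WORD_BYTES


if __name__ == "__main__":
    print(to_data_string("AB").hex())
    print(fits_in_name_table("SYMBOL"), fits_in_name_table("SYMBOLS"))
```

A six-character value occupies exactly eight bytes once the two delimiters are added, which is why six is the limit.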




Table  1.  SYMBOL  Instruction  Set 
A second storage format is used for packed decimal numbers: the
numeric field format. This is the format in which the Arithmetic
Processor produces its results. If an operand for an arithmetic
operation is in data string format, the Format Processor will
automatically convert the operand into the numeric field format
before the Arithmetic Processor proceeds with its operation. The
components of a number stored in numeric field format are the
exponent sign, the mantissa sign, the exponent magnitude, the
mantissa, and the precision code. The exponent and mantissa signs
appear as two bits of the start character (F0, F1, F2, or F3). The
character following the start character contains the exponent
magnitude as two BCD digits. The mantissa is stored after the
exponent character and occupies as many words as are required at
ten digits per word. (The first two and last bytes of a word are
not used for packed BCD data so that mantissa digits will always
occupy the same portion of the word.) Following the last mantissa
digit is a four bit precision code which indicates that the number
represented has either infinite precision (1111), or only that
precision implied by the number of mantissa digits (1110). The
representation for a true numeric zero starts with an F4. The last
word of the numeric string is indicated by a set, high order bit in
the last byte.


Structure values may appear on the stack or in the object string in
linear format, or elsewhere in tree format. [12] In tree format, a
structure is stored in a memory string as a succession of scalar
values and links to substructures. The scalars are stored with
start and end characters as described above, and are aligned on
word boundaries. If, at a later time, the scalar expands and
requires more space, then additional (64 byte) memory groups are
linked (inserted) into the memory string. A link to a substructure
consists of a single word beginning with the character EC, and
containing an address pointing to a separate memory string where
the substructure begins. Following the last component in the
structure, and in each substructure, is the End Vector character
(F7). A null vector, the analogue of the null string, is stored
simply as a memory string beginning with an F7.

In the linear format, the structure value begins with a Begin
Structure character (FD), and ends with an End Structure character
(FF). Between these two characters are stored the components of the
first level of the structure. Scalar components are stored with
start and end characters, aligned on word boundaries as in the tree
format. If the component is a substructure, however, it is
represented by a Start Vector character (FC), followed by the
components of the substructure (which may be scalars or
structures), followed by an End Vector character (FE). The start
and end characters FC, FD, FE, and FF of the linear format are the
only essential characters in the words they begin, so seven bytes
are always wasted in these words.
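
The linear format can be sketched as a recursive serializer over nested Python lists, with strings standing in for scalars. This is an illustration only: the function names, the null padding of partly filled words, and the omission of the high-order end-of-value bit are assumptions beyond what the text specifies.

```python
WORD = 8                                    # bytes per machine word
BEGIN_STRUCTURE, END_STRUCTURE = 0xFD, 0xFF
START_VECTOR, END_VECTOR = 0xFC, 0xFE
STRING_START, STRING_END = 0xF5, 0xF6


def _pad(data):
    """Pad to a word boundary with nulls (the padding value is an assumption)."""
    return data + bytes(-len(data) % WORD)


def _marker(code):
    """A marker occupies a whole word; the other seven bytes are unused."""
    return _pad(bytes([code]))


def _scalar(text):
    return _pad(bytes([STRING_START]) + text.encode("ascii") + bytes([STRING_END]))


def _components(items):
    out = b""
    for item in items:
        if isinstance(item, str):
            out += _scalar(item)
        else:  # a nested list is a substructure
            out += _marker(START_VECTOR) + _components(item) + _marker(END_VECTOR)
    return out


def to_linear(structure):
    """Serialize a nested list of strings into linear format words."""
    return _marker(BEGIN_STRUCTURE) + _components(structure) + _marker(END_STRUCTURE)


if __name__ == "__main__":
    encoded = to_linear(["A", ["B"]])
    for i in range(0, len(encoded), WORD):
        print(encoded[i:i + WORD].hex())
```

In the printed words, each of the four marker words carries one meaningful byte and seven bytes of padding.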

If an initial value statement in SPL is preceded by the keyword
SWITCH, the Translator treats the initial values that are being
assigned as identifiers for labels. The Translator stores values in
the object string as it would for an ordinary initial value
statement, using a single word to store each scalar value. The
scalar label value is stored as a word beginning with the character
D0 (which is also the opcode for the Name Table Pointer
instruction), followed by a 24-bit address pointing to a data
descriptor for the label in a Name Table, followed with the
character F6. Label values may be "moved around" like other values
(e.g. assigned to variables, passed as procedure arguments,
returned as function values, etc.). And of course, a label value
may be the operand of a Go To instruction. Label values cannot
appear as operands for any arithmetic, string or Boolean
operations.


The  Evaluation  Stack 

For each active instance of a block, the Central Processor
maintains a stack to be used for evaluating expressions, procedure
calling, and passing information to other processors. (Whenever
possible, the Central Processor keeps the top word of the stack in
an internal register.) Each stack is a unique memory string and is
created when a block is entered, and deleted when the block is
exited.

The first three words of the stack are a save area for the block.
When the stack is created, a pointer to the block's Name Table, and
a pointer to the calling block's stack are stored in the first word
of the stack. If this block should call another block (explicitly
by a procedure call or implicitly by an ON reference), then a
pointer to the start of the called block's object string and the
contents of the status register are stored in the second word.
Also, the return point and top of stack pointer are stored in the
third word of the stack.

The remainder of the stack is for expression evaluation. If an
operand is being pushed on the stack and it is one word or less in
length, it is copied directly to the stack. Otherwise, it is left
in place, and a pointer (link) word is stacked. (The two exceptions
to this rule are operands for the Output operation, and values for
the assignment operations.)

Link words begin with a character indicating the nature of the
operand: E0 (simple variable or value), E1 (structure), E.1
(label), E4 (scalar value in Name Table), E5 (memory string
containing subscripted variable reference), E6 (scalar value that
won't fit in one word), E8 (scalar or structure valued component of
a structure), EA (IN reference to simple variable), EB (IN
reference to structure), or EE (IN reference to variable with value
in Name Table). The code generated for an IN expression is the IN
instruction (B8), followed by a subscripted variable reference. The
result of the expression is a Boolean value indicating whether the
indicated component of the variable exists. The left address field
of the link word is used when pointing at the data value and the
right address field is used when pointing to the descriptor for the
value.

The Colon (BA), Integerize (DA), and Perform Subscription (DD)
instructions are related to structure references. Expressions for
subscripts are evaluated (using the stack for intermediate values)
and are then converted to a four digit integer. Each subscript is
stored on the stack in a word beginning with BA or DA, which
indicates the type of subscript. DD (Perform Subscription
operation) follows the last subscript. The character substring
operation is handled as a form of subscription. (For example, in
SPL x[1,2:3] is the 3 character string from the first component of
x, beginning at character position 2.) The subscript preceding the
colon is stacked in a word beginning with the character BA (Colon
operator). All the remaining subscripts are stacked in words
beginning with DA (Integerize operation). After the Perform
Subscription operator has been stacked, the link to the variable
being subscripted and the subscript list are moved to a new memory
string and a link word (E5) is placed on the stack pointing to the
string. The subscripted reference is not evaluated until it is
absolutely necessary to have the value or location indicated.

Name Tables

The Translator produces a Name Table for each block (main program,
procedure or ON block) in the source string. All references made in
that block to labels, procedures, or variables are made through the
descriptors in that block's Name Table. Figure 1 shows the
organization of the Name Table and Figure 2 shows the organization
of the control words. The first word of a Name Table is called the
Block Control Word. It contains two address fields which are used
to link all the Name Tables for a program. The first address field
is used to forward link all the Name Tables together in a single
list, beginning with the Name Table for the main program. The
second address field is used as a pointer to the Name Table for the
statically enclosing block. (The Block Control Word for the Name
Table of the main program, which is the outermost block, has no
such pointer.) The actual bit definitions are shown in Table 2.

The Block Control Word contains a bit indicating block in use, and
a bit indicating block recursed. When a block is entered, the
block-in-use bit is set, and the bit is cleared when the block is
left. If a block is reentered (i.e. entered when the block-in-use
bit is already set), the Central Processor calls on a software
routine to perform a "fixup" to handle recursion. [13] (The
hardware was not specifically designed to handle recursion.) This
software must create a copy of the block's Name Table, modify the
original Name Table to initialize local variables, and then set the
block-recursed bit. Similarly, if a block is left with the
block-recursed bit set, another software routine is called that
must undo the work of the first routine.

Following the Block Control Word is a succession of entries for
each identifier in the block. Each entry consists of the ASCII name
of the identifier, taking as many eight-byte words as necessary and
padding with nulls, and followed by a one word data descriptor for
the identifier called the Identifier Control Word (IDCW). Table 3
shows the bit layout of the IDCW.

If the identifier is a local variable, it is so flagged in the
IDCW. There are also flag bits to indicate whether the variable is
scalar valued, structure valued, or has not yet been assigned a
value. If the value is undefined, then the first time that the
variable is accessed, it is given a null, scalar value, which is
interpreted as a zero for arithmetic operations. If a value is
defined, then that value may appear in the object string, the Name
Table (scalars only), or elsewhere in memory.

If an initial value statement occurs in the source string, the
Translator will place a pointer to the object string location of
the value in the IDCW for the variable. When the variable is first
accessed, the value is copied from the object string. (Note that
for structures this means converting from linear to tree format.) A
pointer to the value replaces the old pointer in the IDCW, and the
bit indicating data in object string is cleared.

For data in Name Tables, recall that scalars (excluding label
values) are stored with a start character which begins with hex
"F". This half byte at the beginning of an IDCW is used to indicate
data in Name Table (i.e. the IDCW is the scalar value itself). If
the value is elsewhere in memory, the IDCW contains a pointer to
the memory string where the value begins.

Figure 1. Name Table Format

Global variables are variables that are known in outer blocks.
SYMBOL permits the nesting of blocks as does PL/I. In contrast to
PL/I however, variables are local unless declared global in an SPL
Global statement. The IDCW contains a bit indicating if the
variable is global, and a pointer to the IDCW for the variable in
the enclosing block. There may in general be many levels of such
indirection. If the identifier is for a procedure, the global
variable and procedure bits of the IDCW will be set. Procedures are
an exception to the local unless declared global principle. A
procedure's scope always extends to inner blocks. The IDCW will
also contain a pointer to the location in object string where the
procedure begins. In SPL, a label is always local to the block
where it occurs. However, the scope of the label may be extended to
an inner block if that block contains a Global statement naming the
label. Thus a Go To can be used to jump out of a block. The IDCW
for a label contains a pointer to the location in the object string
where execution is to continue.
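
The chain of global indirection through IDCWs can be sketched as a linked walk. The IDCW class below is a hypothetical stand-in that models only the global bit and the pointer to the enclosing block's IDCW; the real word layout is given in Table 3 of the paper and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IDCW:
    """Simplified Identifier Control Word (illustrative fields only)."""
    name: str
    is_global: bool = False
    outer: Optional["IDCW"] = None   # IDCW for the variable in the enclosing block
    value: Optional[str] = None


def resolve(idcw):
    """Follow global links outward; return the owning IDCW and the hop count."""
    hops = 0
    while idcw.is_global:
        if idcw.outer is None:
            raise LookupError(f"global {idcw.name!r} has no enclosing IDCW")
        idcw = idcw.outer
        hops += 1
    return idcw, hops


if __name__ == "__main__":
    main_x = IDCW("x", value="42")                    # local to the main program
    proc_x = IDCW("x", is_global=True, outer=main_x)  # declared Global in a procedure
    inner_x = IDCW("x", is_global=True, outer=proc_x) # declared Global in an inner block
    owner, hops = resolve(inner_x)
    print(owner.value, hops)
```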

For procedures, the first entries in the Name Table will be the
formal parameters (if any), simply because they are the first
identifiers encountered by the Translator when scanning the source
string. SYMBOL implements call-by-name for all parameters. When a
procedure is called, the formal parameters are linked. Two
mechanisms exist in SYMBOL for linking parameters. The most general
method (indirect parameter) is to compile code near the calling
point in the object string to evaluate the actual parameter
(commonly known as a thunk). When the procedure is called, a
pointer to the code for the actual parameter is placed into the
IDCW for the formal parameter. Whenever the formal parameter is
referenced, this code is executed and the actual parameter is left
on the top of the stack. (Part of the fixup for recursion requires
that a modified copy of this code be generated, since it will in
general contain absolute address references to the original Name
Table.) Often, however, the actual parameter is a simple variable.
[14] In the second mechanism (direct parameter), when the procedure
is called, the IDCW of the formal parameter is set up as if it were
a global variable with a pointer to the IDCW of the actual
parameter. The Translator determines which mechanism to use and
produces the appropriate instructions. Often the Translator chose
to compile an indirect parameter where a direct parameter would
suffice, because it was not clever enough to recognize the simpler
case.
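
The two parameter-linking mechanisms can be illustrated loosely in Python, which has no call-by-name of its own. In this sketch a closure plays the role of the thunk (indirect parameter) and a shared dictionary cell plays the role of the aliased descriptor (direct parameter). The names sum_terms and variables are hypothetical, and this is an analogy rather than a model of the hardware.

```python
def sum_terms(cell, index_name, lo, hi, term):
    """Sum `term` while stepping the caller's index variable from lo to hi."""
    total = 0
    for k in range(lo, hi + 1):
        cell[index_name] = k   # direct parameter: writes go to the caller's variable
        total += term()        # indirect parameter: the thunk is re-run on each reference
    return total


if __name__ == "__main__":
    variables = {"i": 0, "a": [10, 20, 30]}
    # The actual parameter a[i] is an expression, so it is passed as a thunk.
    total = sum_terms(variables, "i", 0, 2,
                      lambda: variables["a"][variables["i"]])
    print(total, variables["i"])
```

Because the thunk is re-evaluated after each change to i, the one expression a[i] yields a different component on every reference. That is the behaviour call-by-name is meant to provide.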

SYMBOL provides a mechanism for trapping references to variables,
procedures and labels, called the ON block. The IDCW for an
identifier which has an ON block contains a pointer to the object
string for that ON block and a bit indicating whether or not the
option is in effect. This bit, which is initially set, may be
cleared by a Disable instruction (B2) or set by an Enable
instruction (B1). If the identifier is a variable name, the ON
block will be invoked immediately after an assignment to that
variable occurs. If the identifier is a label, the ON block will be
invoked upon encountering a Go To statement to that label before
the transfer actually takes place. If the identifier is a
procedure, the ON block will be invoked upon encountering a call to
that procedure before entry takes place.

There is one more piece of information stored in the IDCW. Recall
that a vector stored in memory is a succession of arbitrarily long
scalar values placed end-to-end. Because the addresses of a
component of a vector can not be calculated, finding the n'th
component would mean scanning the preceding n-1 components. One of
the mechanisms used to speed up this search is called current
pointer. In the IDCW for the structure is stored the subscript used
in the last reference, and the address of that component in memory.
If the next reference is to a component succeeding the last, the
search begins where the last search left off. The mechanism is
somewhat limited because space for only two digits is provided in
the IDCW for the subscript.
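
The current pointer idea can be sketched as a cached cursor over a sequence that has to be scanned one component at a time. The class and attribute names are hypothetical. Discarding the cache for subscripts above 99 is an assumption about how the two-digit limit behaves, and a Python list stands in for the memory string even though it could be indexed directly.

```python
MAX_CACHED_SUBSCRIPT = 99   # only two decimal digits are available for the subscript


class Vector:
    """Sequentially scanned vector with a 'current pointer' cache."""

    def __init__(self, components):
        self._cells = list(components)
        self._last_subscript = 0   # 0 means no cached reference
        self._last_position = 0
        self.steps = 0             # components skipped over, for illustration

    def fetch(self, n):
        """Return the n'th component (1-based), scanning as little as possible."""
        if not 1 <= n <= len(self._cells):
            raise IndexError(n)
        if self._last_subscript and n >= self._last_subscript:
            subscript, position = self._last_subscript, self._last_position
        else:
            subscript, position = 1, 0   # restart from the first component
        while subscript < n:
            subscript += 1
            position += 1
            self.steps += 1
        if n <= MAX_CACHED_SUBSCRIPT:
            self._last_subscript, self._last_position = n, position
        else:
            self._last_subscript = 0     # assumed: large subscripts are not cached
        return self._cells[position]


if __name__ == "__main__":
    v = Vector("abcdefgh")
    print(v.fetch(5), v.steps)   # scans from the start
    print(v.fetch(7), v.steps)   # resumes from component 5
    print(v.fetch(2), v.steps)   # earlier component: restarts from the beginning
```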

Object  String 

In addition to the Name Tables, the Translator produces a single
memory string called the object string, which contains the code
directly executed by the Central Processor.

All language components have been translated into a post-fixed
string form in the object string. All variable references are made
through the data descriptors in the Name Table. The object string
is not at all altered while the program is running, and consists of
operands and operators. The operands are pushed onto a LIFO stack
as they are encountered. When an operator is encountered, it is
passed to the appropriate processor, which performs the operation
on the operands on the top of the stack, and replaces them with the
result (if any) of the operation.
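
The operand and operator discipline described above is ordinary postfix evaluation. A minimal sketch follows, with Python integers as operands and three binary operators standing in for the SYMBOL instruction set. The token representation is an assumption made for illustration.

```python
import operator

# Stand-ins for operators; real SYMBOL operators are one-byte opcodes.
OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def run(object_string):
    """Evaluate a postfix token sequence and return the final stack."""
    stack = []
    for token in object_string:
        if token in OPERATORS:
            right = stack.pop()          # top of stack is the right operand
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(token)          # operands are pushed as encountered
    return stack


if __name__ == "__main__":
    # 2 + 3 * 4 in reverse Polish order.
    print(run([2, 3, 4, "*", "+"]))
```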

Each word of object string may contain two machine instructions,
one in each half of the word, each composed of an 8-bit operation
code and a 24-bit address field. The codes E0 through EE never
appear in the object string. Of the remaining instructions, only
six use the address field as such: Block (90), If False Then Jump
(B5), Name Table Pointer (D0), Direct Parameter (DJ), Indirect
Parameter (DM), and Transfer (D7). The Source Pointer (D4)
instruction is generated by the Translator with an address in the
address field, but since this instruction is treated as a No-op,
the address field is not really required. Some operations must
always appear in the same half of the word, so No-ops (00) are used
to fill out the word where necessary.

Blocks, Labels and End-of-Statement

The first instruction of each block in the object string is the
Block instruction (90). The block entry mechanism does not occur as
a result of this instruction however, but as a result of a
procedure call or ON block reference. The Block instruction is
always placed in the second half of a word. The address field
contains the address of the block's Name Table. The accompanying
first halfword contains a No-op instruction.

Each block ends with an End Block instruction (B7). When this
instruction is encountered, the block exit mechanism is invoked:
the current stack is deleted and the calling block's stack becomes
the new current stack. From this stack, the status register and
program location counter are restored. (Generally, an End Block
instruction is preceded by a Return instruction, which also invokes
the block exit mechanism.) The stack for the main program is tagged
so that a block exit from the main program causes a normal program
completion shutdown of the Central Processor.

A Block instruction also appears in the object string at each label
entry point. The IDCW for that label contains the address of this
Block instruction. Whenever a Block instruction is encountered, the
contents of this instruction's address field are compared to the location
of the current Name Table. For a Go To within a block or for block
entry, these two addresses will match, but not for a Go To across
block boundaries. In that case the Central Processor presumes that the
Go To is directed towards a block which directly or indirectly called the
currently active block, and performs block exits until the proper block
is found. If the target of the Go To is not within one of these blocks,
the main program will eventually be exited, and the Central Processor
will shut down as if a normal completion had occurred.
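
The unwinding behaviour of a Go To across block boundaries can be
sketched as follows. This is only an illustrative model, not the
hardware algorithm: the function name, the list of Name Table addresses
standing in for the chain of stacks, and the None return value are all
assumptions introduced for the example.

```python
def resolve_goto(call_chain, target_name_table):
    """Model block exits performed for a Go To.

    call_chain lists the Name Table address of each active block,
    main program first and currently active block last (hypothetical
    representation). Returns the number of block exits performed, or
    None if the main program itself was exited, which the text says
    looks like a normal completion shutdown.
    """
    chain = list(call_chain)
    exits = 0
    while chain:
        if chain[-1] == target_name_table:
            return exits          # addresses match: continue here
        chain.pop()               # block exit: current stack deleted
        exits += 1
    return None                   # main program exited
```

A Go To within the current block performs no exits, while a target that
is in none of the calling blocks runs the chain down to the main
program.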

For each semicolon or END statement, an End Statement
instruction (BB) and a Source Pointer instruction (D9) are placed into
the object string. The address field of the Source Pointer instruction
contains the address of the last word of the source statement in the
source string. The Central Processor treats this instruction as a No-op.
The intended use of the Source Pointer instruction was to facilitate
debugging by linking the location of an execution error to the
offending source statement. The use of this facility was abandoned when
software to decompile the object code directly to source code was
developed, which provided precise resolution of the error location
within the source statement, and because there were problems inherent
in the Source Pointer mechanism.15"1 When the End Statement
instruction is encountered, the stack is cleared of any remaining
operands (simply by resetting the top-of-stack pointer), and user
interrupts (if any) are handled.

Scalars and Structures in the Object String

The String Start codes, F0 through F5, always appear in the first
byte of a word, and indicate the beginning of a scalar value in data
string or numeric field format. If the value is one word long
(indicated by a set high order bit in the last byte), then the word is
pushed onto the stack. Otherwise, a word beginning with an E0 and
containing the address of the first word of the string is pushed onto
the stack, and successive words of the object string are fetched (and
discarded) until a word with a set high order bit in the last byte is
found. The String End character, F6, may appear in any byte of a word,
but is not used in searching for the last word of a string.
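
The scan for the last word of a scalar can be sketched as below. This
is a minimal model under stated assumptions: each 64-bit word is
represented as an 8-byte Python bytes object, the object string as a
list of such words, and the pushed stack entry as a small tuple; none
of these representations comes from the original design.

```python
def is_last_word(word: bytes) -> bool:
    # The last word of a value has the high order bit of its
    # final byte set.
    return bool(word[-1] & 0x80)

def scan_scalar(object_string, start):
    """Return (stack_entry, index_of_next_word) for a scalar value.

    stack_entry is ("value", word) for a one-word value, or
    ("link", start) standing in for the link word that holds the
    address of the first word of a longer string.
    """
    first = object_string[start]
    if not 0xF0 <= first[0] <= 0xF5:
        raise ValueError("word does not begin with a String Start code")
    if is_last_word(first):
        return ("value", first), start + 1
    index = start + 1
    while not is_last_word(object_string[index]):
        index += 1                # fetched and discarded
    return ("link", start), index + 1
```

Note that the scan tests only the high order bit of the last byte; it
never looks for the String End character.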

The codes FC through FF have been described earlier in connection
with initial structure values. These codes may be used to construct
structure values on the stack as well. The scalar components of
these structures in the object string may be arbitrarily complex
expressions. Adjacent scalar components are separated by the Field
Mark operator ( IX 'i. When a word beginning with one of the
characters FC through FF is encountered, that word is pushed onto the
stack. The expressions are evaluated just as if no structure operators
had been encountered, and the result, or a link to it, is left on the
stack. At a later time, the Reference Processor must convert the linear
structure value on the stack into tree format.

Name  Table  Pointer  Instruction 

The Name Table Pointer instruction (D0) is used for all
references to variables, labels, and procedures. The address field of
this instruction points to an IDCW, which is examined by the Reference
Processor when this instruction is encountered. The action taken
depends on what is found in the IDCW.

For variables and labels, a link word is pushed onto the stack.
This word contains the pointer to the IDCW, and begins with a
character that reflects the information found in the Name Table: E0
(link to simple variable), E1 (link to structure), or E4 (link to simple
variable with value stored in Name Table). If the Name Table Pointer
instruction is preceded by an IN instruction (UK), the link word will
begin as follows: EA (IN reference to simple variable), EB (IN
reference to structure), or EE (IN reference to simple variable with data
stored in Name Table). For a label, the link word begins with the
character Ed.
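
The choice of leading character for a variable's link word can be
tabulated in a few lines. The sketch below takes the ordinary prefixes
to be E0, E1 and E4 and the IN prefixes to be EA, EB and EE; the
dictionary keys and the function name are hypothetical. The offset of
hexadecimal 0A between the two sets is simply an observation about the
listed codes, and the text does not say that the hardware derives one
from the other.

```python
# Ordinary link word prefixes, keyed by what the Name Table holds.
LINK_PREFIX = {
    "simple": 0xE0,            # link to simple variable
    "structure": 0xE1,         # link to structure
    "simple_in_table": 0xE4,   # simple variable, value in Name Table
}

IN_OFFSET = 0x0A  # observed difference between IN and ordinary codes

def link_prefix(kind, in_reference=False):
    """Return the first byte of the link word for a variable."""
    base = LINK_PREFIX[kind]
    return base + IN_OFFSET if in_reference else base
```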

A variable reference may be followed by a subscript list.
Expressions to evaluate each subscript are followed by an Integerize
operator (DA) or a Colon operator (BA), as described above. Following this
subscript list is the Perform Subscription instruction (DD). Actual
evaluation of this subscripted variable reference is deferred until the
value or location absolutely must be bound to continue, at which time
the Reference Processor will perform the subscription. A major change
to the original design was made when problems associated with an
earlier binding were encountered.17

Arithmetic  Operations 

When an arithmetic operator is encountered in the object string,
the Format Processor first converts the operands to numeric field
format (if necessary), and then the Arithmetic Processor (or the Format
Processor for the Absolute Value, Negate, and Format operators)
carries out the operation.

The Add (AB), Subtract (AD), Multiply (AA), and Divide
(AF) operators cause the top two operands on the stack to be replaced
by their sum, difference, product, or quotient, respectively, in numeric
field format. The value is either stored directly on the stack, or a link
to the temporary value is stored on the stack if the result contains
more than nine significant digits.

A two digit Limit register places an upper limit on the number
of significant digits to which these four operations are carried out, and
hence an upper limit on the precision of the results. (The precision of
the result may be less than this limit, depending on the precision of the
operands.) This register may be read or written by software, and is
treated as a symbolic variable. The Limit instruction (BC) causes a
word to be pushed onto the stack beginning with a BC. This word is
later converted to the two digit value in data string format if the value
is being read. A one bit Limited flag is set or cleared as a result of
these operations depending on whether or not the precision of the
result would have been more than the Limit register allowed. This
flag can only be read by software. When the Limited instruction fW)
is encountered, the value is pushed onto the stack as a "0" or "1" in
data string format, and the flag is cleared.
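
The interaction of the four arithmetic operators with the Limit
register and the Limited flag can be approximated with Python's decimal
module. This is a sketch under several assumptions: the class and
method names are invented for the example, the second-from-top operand
is taken as the left operand, a set flag is taken to read as "1", and
decimal rounding stands in for whatever the Arithmetic Processor
actually did with excess digits. It does not model the reduction of
precision caused by imprecise operands.

```python
from decimal import Context, Decimal, Rounded

# Opcodes for Add, Subtract, Multiply and Divide, mapped to the
# names of the corresponding decimal.Context methods.
OPS = {0xAB: "add", 0xAD: "subtract", 0xAA: "multiply", 0xAF: "divide"}

class ArithmeticUnit:
    def __init__(self, limit=9):
        self.limit = limit        # two digit Limit register
        self.limited = False      # one bit Limited flag

    def execute(self, opcode, stack):
        """Replace the top two operands with the operation's result."""
        ctx = Context(prec=self.limit)
        right = stack.pop()
        left = stack.pop()
        result = getattr(ctx, OPS[opcode])(left, right)
        # Rounded is raised when digits had to be discarded.
        self.limited = bool(ctx.flags[Rounded])
        stack.append(result)

    def read_limited(self):
        """Model the Limited instruction: read the flag, then clear it."""
        value = "1" if self.limited else "0"
        self.limited = False
        return value
```

With a limit of three digits, dividing 1 by 3 leaves 0.333 on the stack
and sets the flag; reading the flag clears it again. As an aside, the
four opcodes are equal to the ASCII codes of "+", "-", "*" and "/" with
the high order bit set.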

There are six numeric comparison operators: Equal to (BD), Not
Equal to (9D), Greater than (9E), Less than (9t), Greater than or
Equal to (9B), and Less than or Equal to (9A). These operators cause
the top two operands on the stack to be replaced by a "1" or a "0" (in
data string format) based on the outcome of the comparison. When
numbers of unequal precision are compared, the comparison is carried
out to the precision of the least precise operand.

The two monadic arithmetic operators, Absolute Value (UU| and
Negate (DB), simply alter the sign of the top operand on the stack as
required. The Format operator converts the numeric operand second
from the top of the stack to data string format using a control string on
the top of the stack as a template.1'1

Character  String  Operations 

The character string operations are carried out by the Format
Processor, which will also unpack numbers (if necessary) from numeric
field format to data string format before proceeding.

The Join operator IHF.) replaces the two string operands on the
top of the stack with a string formed by concatenating the operands.
The Mask operator (HI j is a general purpose string editing operator.l!'
The operand on the top of the stack is used as an editing template on
the second operand. The result replaces the operands on the stack.

There are three character string comparison operators: Before
(KM), Same ( HU ), and After (HA). As for the arithmetic comparison
operators, the two operands are replaced on the stack by a "1" or a "0"
(in data string format) based on the outcome of the comparison. Two
strings must be of equal length, as well as contain the same characters
in the same order, for the result of the Same operator to yield a "1".
The Before and After operators compare two strings based on a special
collating sequence (null character, special characters,
AaBbCc...XxYyZz012...789) rather than on the magnitude of the
internal eight bit ASCII representation as is customarily done. When
comparing unequal length strings, the shorter string is considered to be
padded on the end with null characters.
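
A rough model of the three string comparison operators follows. The
function names are hypothetical, and the relative order of the special
characters among themselves is not given in the text, so the sketch
falls back on their ASCII order purely as a placeholder.

```python
def collation_key(ch):
    """Rank a character: null, specials, AaBb...Zz, then digits."""
    if ch == "\0":
        return (0, 0)
    if ch.isascii() and ch.isalpha():
        letter = ord(ch.upper()) - ord("A")
        # Each upper case letter sorts just before its lower case form.
        return (2, 2 * letter + (0 if ch.isupper() else 1))
    if ch.isascii() and ch.isdigit():
        return (3, int(ch))
    return (1, ord(ch))   # assumption: specials keep ASCII order

def _keys(a, b):
    # The shorter string is padded on the end with null characters.
    width = max(len(a), len(b))
    return ([collation_key(c) for c in a.ljust(width, "\0")],
            [collation_key(c) for c in b.ljust(width, "\0")])

def before(a, b):
    ka, kb = _keys(a, b)
    return "1" if ka < kb else "0"

def after(a, b):
    ka, kb = _keys(a, b)
    return "1" if ka > kb else "0"

def same(a, b):
    # Same requires equal length as well as equal characters.
    return "1" if a == b else "0"
```

Under this sequence "apple" sorts before "Zebra", which is the reverse
of a plain ASCII comparison, and every digit sorts after every letter.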

Boolean  Operations 

There are three Boolean operators: Not (Hit), And (HO and Or
(Hl)|. The operands used in Boolean expressions are character strings
formed from the three characters "0", "1", and the space character
(which is ignored). The Not operator replaces the top operand on the
stack with a string (or link to a string) formed from the operand by
converting each "0" to a "1", each "1" to a "0" and removing each space
character. The And and Or operators replace the top two stack
operands with their bitwise conjunction or disjunction, respectively.
When these latter two operators are used on unequal length operands
(excluding spaces), the shorter operand is considered to be padded on
the end with "0"s.
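
The Boolean operators act on character strings rather than machine
words, which a short sketch makes concrete. The function names are
invented for illustration, and the pad character for unequal lengths is
taken to be "0".

```python
def _bits(operand):
    # Space characters in a Boolean operand are ignored.
    return operand.replace(" ", "")

def _pad(a, b):
    # The shorter operand is padded on the end (assumed with "0").
    width = max(len(a), len(b))
    return a.ljust(width, "0"), b.ljust(width, "0")

def bool_not(operand):
    return "".join("1" if c == "0" else "0" for c in _bits(operand))

def bool_and(a, b):
    a, b = _pad(_bits(a), _bits(b))
    return "".join("1" if x == "1" and y == "1" else "0"
                   for x, y in zip(a, b))

def bool_or(a, b):
    a, b = _pad(_bits(a), _bits(b))
    return "".join("1" if x == "1" or y == "1" else "0"
                   for x, y in zip(a, b))
```

For example, And applied to "11" and "101" first pads "11" to "110" and
then yields "100".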

The Format Processor is responsible for executing the Boolean
operations. As for the character string operations, operands will be
converted to data string format if necessary (since, for example, '"liar
could be both the bit string of length three, and the integer following
>«).

Assignment  Operations 

There are two assignment operators: Left Assign (OF), and
Right Assign (IF). The former assigns the value indicated by the top
operand on the stack to the location indicated by the operand second
to the top of the stack; the latter assigns in the opposite direction.
Until one of these operators is encountered in the object string, it is
not known whether the preceding operands are to be used for locations
or values. This is why pointers to IDCW's are used on the stack for
variables, rather than the variable's value or location. Before the
assignment is carried out, any links to values are converted to actual
values on the stack, even if this may require more than one word.
This operation, and the final assignment of value to location, are
performed by the Reference Processor. For a structure assignment
statement, the value appears on the stack as a structure in linear format.
The Reference Processor is responsible for converting this value to tree
format as it stores the value.

Transfer and Go To Instructions

The Transfer instruction (D7) simply resets the program location
counter to the value in the instruction's address field. This instruction
is generated to detour around code in the object string which is not to
be executed in-line, such as initial data values, internal blocks, Else
clauses, and code to evaluate actual parameters. It is not generated as
a result of the SPL Go To statement, however.

There is also a conditional transfer instruction which is generated
for each SPL If statement, called the If False Then Jump instruction
IBS). It is used to jump around the code for the Then clause, to the
code for the Else clause (if any), if the conditional expression in the If
statement is false. Preceding this instruction there will be an
expression which should result in a single Boolean value on the top of the
stack. (Anything else will cause a processing error shutdown.) This
value is tested and if it is a "0", then the program location counter is
set to the value in the instruction's address field. Otherwise, execution
continues at the instruction following the If False Then Jump
instruction.

A Go To instruction I US) is generated for each Go To in the
source program. Unlike the two jump instructions, it contains no
address in its address field. The target of the Go To is found
indirectly from the top operand on the stack. This operand may be a
link to a label (containing a pointer to the IDCW for a label), or a
simple or subscripted variable reference. The Reference Processor is
called on to evaluate the variable reference and place a label value on
the stack. Recall that label values also contain pointers to the IDCW's
of labels. Until the IDCW is examined, it is not known whether the
label has been defined, or even if the IDCW is for a label at all. If the
IDCW is not for a defined label, a processing error shutdown will
result. Otherwise, the IDCW will contain the address of a word
containing a Block instruction where execution will continue as described
above.

Procedure  Call,  Parameters,  and  Return 

The code in the object string for a procedure call (or function
reference) will in general consist of three parts: a Name Table Pointer
instruction for the procedure, code to evaluate any indirect parameters,
and parameter instructions. The code for indirect parameters is not
executed in-line, so if there are any indirect parameters, then a
Transfer instruction follows the Name Table Pointer instruction,
directed at the first parameter instruction to be executed. Then for
each actual indirect parameter there is the code to evaluate that
parameter, followed by a Parameter Return instruction (D8). (This
instruction is of course used when the parameter is referenced to signal
the end of the actual parameter code.)

Lastly, if there are any direct or indirect parameters, there are
the parameter instructions, which will appear in the object string, two
per word, in the order opposite to the order in which the
corresponding parameters appear in the source program. For indirect parameters,
there will be an Indirect Parameter instruction (D5) containing the
address of the code to evaluate the actual parameter. For direct
parameters, there will be a Direct Parameter instruction (D4)
containing the address of the IDCW for the actual parameter.

After the Name Table Pointer instruction is encountered, the
Reference Processor informs the Instruction Sequencer that the
identifier is for a procedure. The Instruction Sequencer then begins looking
for parameter instructions, ignoring No-ops and executing any Transfer
instructions. The parameter instructions are pushed onto the stack,
one per word. The first instruction which is not a No-op, Transfer, or
parameter instruction will be executed on return from the procedure
(the return point). The parameter instructions are then popped from
the stack (note that the stack operations reverse their order); the
IDCW's of the formal parameters are modified as described above. If
the number of formal parameters does not equal the number of actual
parameters, a processing error shutdown occurs. Following these
operations, a block entry is made at the start of the procedure's object
code, which is found from the procedure's IDCW.
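
The way the stack restores source order to the parameter instructions
can be shown with a small model. The function name, the tuple
representation of a parameter instruction, and the use of an exception
to stand in for a processing error shutdown are all assumptions made
for the example.

```python
def bind_parameters(parameter_instructions, formals):
    """Pair formal parameter names with actual parameter instructions.

    parameter_instructions is given in object string order, which is
    the reverse of source order. Pushing then popping reverses the
    sequence, so the actuals come off the stack in source order.
    """
    stack = []
    for instruction in parameter_instructions:
        stack.append(instruction)         # pushed one per word
    actuals = []
    while stack:
        actuals.append(stack.pop())       # popped in source order
    if len(actuals) != len(formals):
        raise RuntimeError("processing error shutdown: "
                           "parameter count mismatch")
    return dict(zip(formals, actuals))
```

For a call written with actual parameters a, b, c in the source, the
object string holds the instructions for c, b, a, and the binding pairs
the first formal with a.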

A reference to a direct formal parameter is identical to a
reference to a Global variable, label, or procedure. When an indirect
formal parameter is referenced, a word is pushed on the stack containing
the state of the Instruction Sequencer. Program execution then
continues at the address designated in the formal parameter's IDCW, just as
if no parameter reference was in progress. (The code to evaluate the
actual parameter may itself contain indirect parameter references.)
When the Parameter Return instruction (D8) is encountered, the top
of stack register contains the actual parameter (value or address) or a
link to it. The state of the Instruction Sequencer is restored from the
topmost word of the stack in memory. Program execution then
continues as before the parameter reference.

The Return instruction may or may not return a value (or
location), which may or may not be used. The internal top-of-stack
register will contain the operand to be returned, if any. The block exit
mechanism invoked by the Return instruction (D6) will delete the
memory space occupied by the current stack, but will not clear this
register. So this register becomes the top of stack for the calling block.
If the calling block is expecting a value, and the register is empty, a
processing error shutdown will result. If a value is returned, and none
is required, no processing error shutdown will result. This is because
the End Statement instruction, which will follow a simple procedure
call, clears all operands from the stack, including the contents of the
top-of-stack register.

Input  and  Output 

Input and Output operations transfer and transform information
between memory and the outside world. Six I/O status bits are
maintained by the Instruction Sequencer to indicate the type and mode of
the I/O operation. When the Input instruction (80) is encountered,
the Input I/O status bit is set, and the remaining bits are cleared. The
Output instruction (81) causes the Output I/O status bit to be set and
the others to be cleared.

The remaining four bits are used to indicate the I/O mode.
Following the Input or Output instruction in the object string there may
be a String instruction (A3), a Data instruction (A1), or, for Input
only, an Exact instruction (A4) or an Empirical instruction (A5). For
each of these instructions there is a corresponding I/O status bit which
is set when the instruction is encountered. The List mode is indicated
by the absence of any other I/O mode.

The I/O mode determines the type of data transformation to be
performed. In memory, the data may be a scalar value in data string
or numeric field format, or a structure in tree format. In the outside
world, the data exists as an ASCII character string. Structures in the
outside world are represented explicitly using the characters "<" and
">" to delineate each structure or substructure, and the field mark
character "|" is used to separate adjacent scalar components. It is the
Input/Output Processor's responsibility to transform data between this
explicit structure format, and the internal, linear format, if the I/O
mode calls for such a transformation.

Data may be directed to or from a number of different I/O
devices. If the default device, with associated device number zero, is not
to be used, then following the I/O type and mode instructions, there
will be code for an expression for the device number, followed by a To
instruction (A0) for output, or a From instruction (Hll) for input.
(After the To or From instruction, a Comma instruction is expected
but ignored.) The code to evaluate the expression is executed, and a
value or link is left on the stack. The To or From instructions force
the value to be placed on the stack, and then to be integerized, as for
subscripts. The two least significant (BCD) digits are extracted and
designated as the device number for the operation. The Channel
Controller associates devices with device numbers.
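
The derivation of the device number from the expression value can be
sketched in a couple of lines. The function name is hypothetical, and
truncation toward zero is assumed for the integerize step, since the
text here does not say how fractional values are handled.

```python
def device_number(value):
    """Return the device number selected by a To or From expression."""
    integer = int(value)          # integerize (assumed to truncate)
    # Keep only the two least significant decimal digits.
    return abs(integer) % 100
```

An expression evaluating to 1234 would therefore select device 34, and
a value of 0 selects the default device.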

Next to appear in the object string will be the I/O items,
separated by Comma instructions (AC). The Comma instruction
causes the value of the preceding I/O item to be placed on the stack
for output; for input, it causes the Input/Output Processor to get an
input value which is then assigned to the preceding I/O item. The last
I/O item is followed by either an End Statement or an End Block
instruction, each of which is treated as having been preceded by a
Comma instruction. In addition, for output, these two instructions
cause the Input/Output Processor to output the values on the stack,
starting at the bottom and ending at the top.

For Input Data, there will be no I/O items since both variable
names and values come from the outside world. Each I/O item for
Output Data will be a simple variable reference (Name Table Pointer
instruction). For the remaining input modes, an I/O item may be a
simple or subscripted variable reference or a procedure reference
(which must eventually return a simple or subscripted variable
reference). For the remaining output modes, an I/O item may be any
expression. The code for each I/O item is executed exactly as if no I/O
instructions had been encountered.

For all output modes, the actual value of the I/O item must be
placed on the stack. A scalar value in numeric field format is
converted to data string format by the Format Processor. If the value is
for a structure, a temporary stack is created onto which the Reference
Processor places the value, converting it from tree format to linear
format. The structure is then copied to the regular stack and any
numeric scalar components are converted to data string format.

For Output Data, the variable's name must be placed before the
value on the stack. The name is found in the one or more words
preceding the variable's IDCW, a pointer to which will exist in the top
of stack register as a result of the just executed Name Table Pointer
instruction. If the variable is scalar valued, a word is placed before
and after the value of the variable on the stack beginning with the
characters F0 and F1, respectively. The Input/Output Processor
converts each of these words to the field mark character "|" to delineate
the value in the output.

For Input Data, the Input/Output Processor calls on the
Translator to extract the variable names and values from the input character
string and perform the assignment. For Input List and Input String,
the Input/Output Processor will leave a value on the stack on top of a
link word pushed onto the stack as a result of executing the code for
the I/O item. The Reference Processor is called on to perform an
assignment operation, just as if a Left Assign instruction had been
encountered.

The Exact and Empirical input modes are used to convert input
values to numeric field format. A temporary stack is created onto
which the Input/Output Processor places the input value. The value is
moved to the regular stack and the scalar value, or each scalar
component, which must be a number, is converted to numeric field format.
If a precision tag was given in the input value, it is used in the
conversion; otherwise, the precision is determined by the input mode. The
assignment operation is then carried out as for the List and String
input modes.

Pause  and  System 

The Pause instruction (do) and System instruction (97) cause the
Central Processor to load the error code register with the one byte
opcode and then shut down. The hardwired System Supervisor notices
that the Central Processor has shut down and examines the error code
register. If the instruction was Pause, then the System Supervisor
deletes the process from the Central Processor's run queue. This has
the effect of halting the execution of that process. A paused process is
restarted when the user presses the Continue button on his terminal.1'
If the instruction was System, then the System Supervisor executes a
previously defined memory string of control words. The System
instruction is used in "privileged" software to modify low level system
data structures which are normally maintained by hardware. Adding
or deleting a process from a processor queue is a typical example of
the use of the System instruction.

Logical Memory

As mentioned previously, SYMBOL differs from most von
Neumann computers in that the memory structure is not organized as a
contiguous set of sequentially numbered storage cells. Instead, a
"logical memory" structure implemented by the Memory Controller is
imposed on top of the virtual memory system. The Memory
Controller takes each virtual page and divides it up into three sections.
The first four words of the page are the "Page Headers", which contain
pointers and status information. One of the pointers links pages
together in a forward linked list; SYMBOL nominally used three of
these "Page Lists" for each terminal.6 The first Page List was for the
user's source program, the second for user data, the stack and Name
Tables, and the third for the Object String. The Page Headers also
indicate available space in the page by status bits, and outside the page by
a Space Available List pointer which points to the next page that
contains available space. The remainder of the page is divided into
twenty-eight eight-word "Groups", and twenty-eight "Group Link
Words". The Memory Controller then organizes these contiguous eight
word "groups" into memory strings with a doubly linked list. All of
the processors other than the Memory Controller then view this logical
memory structure as the fundamental memory organization of the
machine.
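
The page arithmetic implied by this description can be checked with a
short sketch. The constant and function names are invented for the
example, and the ordering of the three sections within the page
(headers, then Group Link Words, then Groups) is an assumption; the
text gives only their sizes.

```python
HEADER_WORDS = 4        # Page Headers
LINK_WORDS = 28         # one Group Link Word per Group
GROUPS = 28
WORDS_PER_GROUP = 8

# Total words in a page, from the three section sizes.
PAGE_WORDS = HEADER_WORDS + LINK_WORDS + GROUPS * WORDS_PER_GROUP

def locate(offset):
    """Classify a word offset within a page (assumed section order)."""
    if not 0 <= offset < PAGE_WORDS:
        raise ValueError("offset outside page")
    if offset < HEADER_WORDS:
        return ("header", offset)
    if offset < HEADER_WORDS + LINK_WORDS:
        return ("link", offset - HEADER_WORDS)
    group, word = divmod(offset - HEADER_WORDS - LINK_WORDS,
                         WORDS_PER_GROUP)
    return ("group", group, word)
```

The three sections add up to 256 words, so the headers and link words
together occupy exactly the space of four Groups.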

The Privileged Memory Operations

There are sixteen instructions which operate directly with
memory addresses to read or alter storage. These instructions are
issued by the hardware processors or by systems programs which have
been translated in "privileged" mode. A request for memory to the
Memory Controller consists of a 92 bit value consisting of three fields.
First is a four bit field for the Page List, followed by a twenty-four bit
absolute virtual address field, and a sixty-four bit data field. These
fields are transmitted to the Memory Controller, which may use and
modify them, returning them to the originating processor. Each
memory request is also accompanied by the terminal number. Because
words in memory are not necessarily contiguous, no address indexing
calculations can be performed. For this reason the memory operations
are of the flavor, "Here is an address, get me the data at that address
and tell me what the address of the next word is." More specifically, the
sixteen memory operations are as follows:

Assign Group: Used to allocate a new memory string. If the
transmitted address is non-zero, the Memory Controller will try to allocate a
group from the same page. If no group is available on that page, an
empty group will be looked for by following the Space Available List
pointer to a page which has free space. If there are no pages on the
Page List with space, a new page will be allocated from the system
Available Page List. If the transmitted address field is zero, then the
Memory Controller will allocate a group from the same Page List as
specified in the page list field. If the transmitted page list field and
the address field are both zero, then a new page is allocated and the
first group on the page will be allocated. The returned address is the
address of the first word of the assigned group.

Fetch and Follow: Returns the data at the specified address and
returns the address of the following word in the string.

Fetch Reverse: Returns the preceding word in the string and its
address.

Follow and Fetch: Returns the data and address of the word following
the specified address.

Store and Assign: Stores the data at the indicated address and returns
the address of the successor word. If no successor word exists, a new
group is allocated as indicated in the Assign Group instruction, and is
linked onto the current storage string.

Store Only: Stores the data at the indicated address. The returned
address is changed by adding one to the low order three bits modulo
eight. (This has the effect of wrapping the address around in the
group.)

Store and Insert: Stores the word in the indicated address and returns
the successor address. If the transmitted address specifies the last word
of a group, then a new group is allocated and inserted between the
group of the transmitted address and the group which followed it.

Insert Group: A new group is allocated and inserted after the group
specified by the transmitted address. The returned address is that of
the first word of the new group.

Delete String: Deletes a memory string; the transmitted address must
be that of the first group of the string. The associated Page List must
also be supplied so that the string, when reclaimed, can be returned to
the proper available space list. If the Page List supplied is that of the
user data, then pointers to substructures will be looked for, and that
space will be deleted also.

Delete to End of String: Obtains the address of the succeeding group
and reclaims that and all following groups. The associated Page List
must also be supplied so that the reclaimed part of the string can be
returned to the proper available space list. If the Page List supplied is
that of the user data, then pointers to substructures will be looked for,
and that space deleted also.

Delete Page List: The Page List supplied will be reclaimed for the
terminal on which the request was made. This operation is handled by
the Memory Reclaimer.

Reclaim Group: If the transmitted address is zero, fetch the top of the
terminal's garbage stack; otherwise link the group onto its page's
available group list.

Fetch Direct: Used for fetching one of the terminal header registers,
or any absolute core address (rather than a virtual address). The data
at the real memory address is returned. The returned address is
changed by adding one to the low order three bits modulo eight.

Store Direct: Stores the data at the real memory address given. The
returned address is changed by adding one to the low order three bits
modulo eight.

Fetch Terminal Header: Used to fetch one of the 21 header registers
associated with each terminal. The Fetch Terminal Header instruction
differs from the Fetch Direct instruction in that the Memory Controller
automatically inserts the terminal number into the address field. This
allows the address of a particular terminal header to be specified in a
terminal independent manner. The address fetched is the specified
physical address with the terminal number added in, shifted left by
three bits.

Store Terminal Header: Used to store one of the 21 header registers.
The address stored into is the specified physical address with the
terminal number added in, shifted left by three bits.
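
The address arithmetic stated for Store Only, Fetch Direct, Store Direct
and the two Terminal Header instructions can be illustrated with a short
sketch. The function names are hypothetical and addresses are modeled as
plain non-negative integers; this shows only the arithmetic described in
the text, not the Memory Controller hardware.

```python
GROUP_MASK = 0b111  # the low-order three bits select a word within a group


def wrap_in_group(address: int) -> int:
    """Add one to the low-order three bits modulo eight.

    The higher-order bits are left unchanged, so the returned address
    wraps around inside the same eight-word group.
    """
    return (address & ~GROUP_MASK) | ((address + 1) & GROUP_MASK)


def terminal_header_address(physical_address: int, terminal: int) -> int:
    """Add the terminal number, shifted left by three bits, to the address."""
    return physical_address + (terminal << 3)
```

For instance, wrapping the last word of a group returns the first word of
the same group, and terminal 3 is offset by 24 words from the specified
physical address.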

Conclusions

The SYMBOL instruction set has now been described in enough
detail to show the complexities of implementing a high level instruction
set in hardware. Many further details exist, but these are relatively
minor, and would probably not be of interest to the reader. One of
the reasons that the instruction set had not been described earlier was
that the users of the machine were not supposed to have to know
about the machine level instruction set.

SPL was the language of the machine, [...]
actually turned out to be [...] instruction
set. Only by a prolonged [...]
the details of the instruction set [...]
many inefficiencies which [...] in the machine. [...]

is far from ideal: program measurements [...] show that the [...]
used was very inefficient. Some of the [...] can be excused
when it is realized that SYMBOL was an experimental machine, and
that the designers were limited in the time they could spend in
optimization. Nevertheless, we feel it is important that the instruction
set be documented as it was implemented. This paper, used in conjunction
with the previously published papers, completes that documentation.
ABSTRACT

The development of debugging tools on the high
level language SYMBOL computer is described. The
software system developed allows a detailed interactive
investigation of the dynamic and static program
structure and user variables entirely at the source
program level for a procedural block-structured
programming language. Source statements are "de-compiled"
from the object code; descriptors and hardware
maintained type tags allow the unambiguous interpretation
of data values. Language constructs in the SYMBOL
programming language which aid in debugging are
also described. Comments are made on the evaluation
of the system and how the debugging environment
was affected by the high level language architecture
of the SYMBOL machine.

Introduction 

One of the motives often suggested for High Level Language
Computers has been that they make program debugging easier. The
high level language SYMBOL computer system1,2,3 provided a unique
opportunity to test out this hypothesis. In this paper the
state-of-the-art debugging tools developed for SYMBOL will be presented,
along with a description of how and why these tools were developed. The
exposition of these debugging tools is important for two reasons. First,
it documents how debugging was achieved on perhaps the most
advanced high level language computer yet constructed. Second, it
completes documentation on what many users observed to be the most
important feature of SYMBOL - the high level language programming
and debugging environment. Only by examining this user visible system
software imposed on top of the SYMBOL architecture can one make a
judgement on the effect the high level architecture had on the
debugging environment.


Unveiled in 1971, the SYMBOL computer system had as its
prime goal to demonstrate with a full-scale working computer that a
procedural general-purpose programming language and a large portion
of a time-shared operating system could be implemented directly in
hardware. This approach was intended to show a marked improvement
in computational rates over conventional systems. Almost every
aspect of the system was unique, from its eight special function
processors to the totally new SPL4 programming language. While the system
was capable of running totally without system software this was rarely
done. System software for SYMBOL greatly contributed to the
"system" interface which appeared to the user as independent of the
hardware or software implementation.

Early  Debugging 

Without software SYMBOL provided no facilities for debugging
programs with execution errors. Consequently one of the most useful
programs early in the project was a traditional interactive memory
dump. This fairly small program was effective for program debugging
if one was familiar with the high level instruction set and data
organization. Upon detection of an execution error the System Supervisor
would suspend the user program, save the 24 "header" registers in a
known place, and then start up a "Monitor" program on the user's
terminal. From the Monitor the user could enter the dump program to
look at his dead program and its source. As with most users, the first
questions to be answered are why and where. The first place to look
was in the AH1 header, because one of the bytes was an "Error Code
Character"; this was then translated to English by looking at one of the
many Engineering Reference Cards lying around. Once the nature of
the error was determined the current object code address was taken
from the left half of the AH2 header. By dumping five or six words
of object code at this address the user could usually encounter a Source
Pointer opcode, generated by the Translator at each semicolon in the
source program. The address field of the Source Pointer instruction
was an absolute address pointing to the corresponding line in the
original source code. This entire process could be done at a terminal in
less than a minute.
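
The manual search described above can be sketched as a short scan. This
is an illustrative model only: object code is represented as a dictionary
from address to an (opcode, address field) pair, the mnemonic SRCPTR is a
hypothetical stand-in for the Source Pointer opcode, and the forward
direction of the scan is an assumption.

```python
SOURCE_POINTER = "SRCPTR"  # hypothetical mnemonic for the Source Pointer opcode


def find_source_address(object_code, start, window=6):
    """Scan up to `window` words of object code, beginning at `start`.

    `object_code` maps an address to an (opcode, address_field) pair.
    Returns the address field of the first Source Pointer found, which
    points at the corresponding source line, or None if the window
    contains no Source Pointer.
    """
    for address in range(start, start + window):
        word = object_code.get(address)
        if word is not None and word[0] == SOURCE_POINTER:
            return word[1]
    return None
```

Here `start` plays the role of the object code address taken from the
left half of the AH2 header, and the default window of six words mirrors
the "five or six words" the user would dump by hand.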

The Source Pointer Problem

Using the source pointers left by the Translator to find the source
line was straightforward, and so was soon programmed into the
terminal Monitor. Error messages were also automatically translated to a
more understandable English message. While this system of tracking
down the source was simple it had the proverbial "Achilles' heel" that
made it untrustworthy and potentially dangerous. A program could be
interrupted, the Monitor and its subsystems invoked, and then the
program could be resumed at the point of interruption. If the user edited
the program source and then resumed execution of the object code, the
source pointers in the object code were potentially invalid. The
situation is not unlike the "dangling reference" problem encountered in
block structured languages.9 In short-lived user programs this turned
out to occur infrequently; systems programs on the other hand might be
executing the same object code for weeks while the source was being
modified, allowing source and object files which differed greatly in
content and date of last modification. Debugging systems programs
then brought us back to square one.
More Problems

SYMBOL provided a rather inadequate hardware text editor
which was limited to specially designed terminals; this led to the
implementation of software text editors. These editors initially used
SPL structures (arrays) for storing and manipulating text. Since the
Translator required that the source program be in a contiguous
memory string, the user's source program was copied from the text
structure to the memory string just prior to translation. Source
pointers generated by the Translator therefore pointed to the copy,
requiring a search through the structure to find the original source and
its location. Both the copying and searching processes were painfully
slow; eventually this type of editor was replaced with one which
worked directly on memory.

Decompile

A rather unique approach was taken to solve the above
problems; a program was constructed which "de-compiled" SYMBOL object
code back into SPL source statements. The decompilation process was
greatly facilitated because the SYMBOL instruction set was so similar
to the SPL language and by the direct and simple manner in which the
Translator generated object code. Decompilation was remarkably
effective in re-creating source; in most instances the decompiled
statement differed from the original source only in minor ways such as the
number of blanks, carriage returns, the case of letters, and the
omission (in the decompiled version) of redundant parentheses.

Execution Error Diagnostics

When an execution error occurred, the user process was
suspended and the System Supervisor invoked the Monitor on the
appropriate terminal. Use of the decompile program, coupled with a
program to interpret data values on the evaluation stack, allowed
excellent diagnostics to be given entirely in terms of the high level
source program. On an execution error the Monitor generated the
following:

1. Notification that an execution error had occurred.

2. The nature of the error (in an understandable form).

3. The statement at which the error occurred.

4. An arrow beneath the source line pointing to the particular
operator or operand causing the error.

5. If the error involved a monadic operator then its operand
was printed. If the error involved a dyadic operator then
both operands were identified and printed.

For  example,  division  by  zero  in  one  program  generated  the  following 
diagnostic: 

•••  EXECUTION  ERROR  (ZERO  DIVISOR  -  CODE  JU) 

IN THE FOLLOWING STATEMENT

«l  »  qmin  •  (ml  •  m2  /  (|imms  /  rtw)  /  (mimmn  i  (I  -  ID); 

↑

RIGHT  OPERAND 

I  00  I 

LEFT  OPERAND: 

midterm:  1  -  C6M  ( 

MONITOR IS NOW IN CONTROL.


At this point the Monitor waited for commands from the user.
Logical choices would be to enter the text editor and correct the
problem or to enter the INQUIRE subsystem for further examination.

INQUIRE 

Knowing the source line and operands involved in an error is
only the first step in providing good high level language debugging
tools. For proper debugging we feel it is necessary to be able to
examine the contents of variables, to examine the currently active calling
sequence down to the source line invoking the call of each active
procedure, and to examine the state of the expression evaluation stack.
Such examination should be available after an execution error has
occurred or at any time during the normal running of a program. This
is the function of the INQUIRE subsystem. The INQUIRE subsystem
is the primary means of examining the user program variables and
block structure of the program. When entered, the "command
environment" is set to the block that was in execution when the user
program was interrupted.

INQUIRE responds to commands from the user in the following
way. Entering an identifier causes the value of that identifier to be
printed, providing that it is known to the command environment.
Special qualifications are given to several classes of variables. An
identifier that has never been referenced is tagged as "unreferenced null".
Procedure names, labels and switch components are tagged only as to
type. Parameters are identified and the parameter linking is followed
to the calling environment to resolve the parameter. Global variables
are identified and the value in the defining environment is printed.
Each scalar element of a vector is printed along with its subscript list.
(Successive nulls are grouped together in an attempt to save paper.)
Individual elements of a vector can also be obtained. All identifiers
known to a block can be obtained with the DATA command.

Identifiers from other than the current block are available as
well; one of the unique features of INQUIRE is the ways in which
various blocks can be traversed. For example, the value of an
identifier in the block calling the block of the current command environment
is obtained by preceding the identifier name with an "up-arrow"
character (↑ or ^). This specifies that the identifier is to be looked for by
going out one level from the command environment, according to the
dynamic nesting. In a similar manner, any number of dynamically
nested blocks may be traversed by preceding the identifier name with
the appropriate number of up-arrows.
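
The up-arrow traversal of the dynamic nesting can be sketched as follows.
This is a simplified model under stated assumptions: each active block is
a dictionary of identifier values, the list is ordered from the command
environment outward through its callers, and "^" stands in for the
up-arrow character. The function name is hypothetical, and the special
handling INQUIRE gave to parameters and global variables is omitted.

```python
def lookup(frames, command):
    """Resolve an identifier command such as "total" or "^^total".

    `frames[0]` is the command environment, `frames[1]` the block that
    called it, and so on along the dynamic (calling) chain. Each leading
    "^" moves one level outward before the identifier is looked up.
    """
    name = command.lstrip("^")
    levels = len(command) - len(name)
    if levels >= len(frames):
        raise LookupError("no active block %d levels out" % levels)
    frame = frames[levels]
    if name not in frame:
        raise LookupError("%r is not known in that block" % name)
    return frame[name]
```

With three active blocks, `lookup(frames, "x")` reads the value in the
command environment, while `lookup(frames, "^x")` reads the value of the
same name in the calling block.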

The static program nesting can also be used to specify a
particular block. Before this can be accomplished however, a BLOCK or
PROCS command must be given. The BLOCK command prints the
static block structure of the program and assigns an integer value to
each block. This number provides a unique naming for each block.
The character ">" followed by one of these integer values will change
the current command environment to the specified block number.
Setting the command environment to a block which was not a member of
the calling sequence allows looking at static variables but precludes
obtaining the values of any formal parameters since a non-active
procedure has no calling point for parameter linkage. Selection of a
non-active block also prohibits use of the up-arrow commands. In addition
to printing the block structure of a program, the BLOCK command
prints the names of all identifiers used in each block and an
abbreviated tag as to their data type, e.g., scalar, structure, label,
procedure, etc. The PROCS command is similar to the BLOCK
command with the exception that it prints only the block structure.

The WHERE command locates the statement in execution,
prints it on the console device, and then pauses. Pressing the
Continue button will cause one succeeding statement to be printed before
pausing again. This sequence is exited by pressing the F0 special
function button. The most useful application of the WHERE command is in
conjunction with the "up-arrow" feature described previously. A
command consisting of ↑WHERE prints the statement that called the
procedure in execution, and thereby reveals the name of the procedure
and its actual parameters. In this manner the calling sequence may be
examined any number of levels on a very specific basis. Since an
entire statement is printed using the WHERE command, a more
specific reference is needed to isolate the exact point of execution. A
large expression may contain many operators and operands; for
example, the statement in the diagnostic of the previous section contained
several division operations. To isolate the exact point of execution or
error a pointer is printed beneath the statement directly below the
appropriate operand or operator.

Examination of expressions which may have been partially
evaluated is possible using the STACK command. This command
prints the top entry of the stack and then pauses. Pressing Continue
prints one successive stack entry and then pauses again if the bottom of
stack has not been reached. Pressing the F0 button before the bottom
of stack is reached will cause a return to the command mode. As SPL
is a block-structured language, there is a separate stack associated with
each active block. The stacks of other active procedures are accessed
by preceding the STACK command with the desired number of
up-arrows or by first entering the appropriate block via the ">" command.

If a program was interrupted by pressing the interrupt key, the
program may be resumed at the point of interruption by using the
RESUME command or at a label by using the GO TO command.
The GO TO command has the restriction that the label must be in a
block which is currently active. The RESUME command may not be
used after an execution error although GO TO may be used regardless
of the cause of the interrupt.

If an identifier has an ON block associated with it, that ON
block may be enabled or disabled from INQUIRE. A more detailed
description of ON blocks follows.

A brief description of INQUIRE commands is available from the
terminal with the HELP command. A listing of the HELP text is
given in Appendix 1. Appendix 2 shows a sample terminal session
using INQUIRE.

ON blocks

ON blocks are an SPL language construct extremely useful for
debugging. An ON block is similar to a procedure, in that it is a
series of statements invoked from some calling point. Unlike
procedures, however, invocation of an ON block is caused by the
occurrence of an implicit event specified by a list of names following
the ON declaration. If the list contains a variable name, the ON block
will be invoked immediately after an assignment to that variable
occurs. If the list contains a label, the ON block will be invoked upon
encountering a GO TO statement to that label before the transfer
actually takes place. If the list contains a procedure name, the ON block
will be invoked upon encountering a call to that procedure before
entry to the procedure takes place. If the list contains the word
INTERRUPT, the ON block will be invoked when the user presses
one of the function buttons (F1 thru F15).

The ON block facility bears a resemblance to the PL/I ON
CHECK condition. The major difference is that in SPL multiple ON
blocks are allowed to exist within a particular environment (scope) and
that the invocation of ON blocks can be controlled selectively for
individual identifiers. The IBM PL/I(F) compiler makes no provision for
dynamically enabling or disabling the CHECK condition, and while
the ON CHECK units may be dynamically switched around, such
switching applies equally to all variables to which the CHECK
condition applies. In SYMBOL invocation of an ON block for a particular
identifier is controllable by the SPL ENABLE and DISABLE
statements.

A typical use of an ON block is shown in Figure 1, which
illustrates a method to discover where a variable is assigned undesired (or
desired) values. The value of I will be printed every time it is
modified and the user can then decide whether to continue, or interrupt his
program and diagnose further with INQUIRE. Once the user is
satisfied that the particular portion of the program being monitored by an
ON block is behaving properly the ON block can be disabled from
INQUIRE. The implementation is a major advance over what is
possible in most systems in that no extra code needs to be generated to
invoke an ON block nor does the program have to be recompiled to
turn on or off the invocation of an ON block. This has major benefits
in terms of execution efficiency and the ability to debug non-stop
programs, not to speak of the time saved in editing and re-compiling
programs after changing the debugging options. The ON block is also a
clean way of debugging a program in that it concentrates the
debugging code in one place, in contrast to spreading debug I/O throughout
a program; this practically eliminates needing to "clean up" a program
after debugging.

ON I; NOTE This block invoked whenever I is assigned to;

GLOBAL I;

OUTPUT | The value of I is |, I;

PAUSE;

END

Figure 1. Simple ON block
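
The behavior of the ON block in Figure 1 can be approximated in a
conventional language as a hook that runs immediately after each
assignment and that can be enabled or disabled at run time. The sketch
below is only an analogy: the class and method names are hypothetical,
the PAUSE is omitted, and enabling the block when it is attached is an
assumption. On SYMBOL the check was made through the descriptor rather
than by wrapper code.

```python
class Monitored:
    """A variable with an optional ON block invoked after each assignment."""

    def __init__(self, value=None):
        self._value = value
        self._on_block = None
        self.enabled = False  # plays the role of the "ON Enabled" indication

    def on(self, block):
        """Attach an ON block (any callable) and enable it."""
        self._on_block = block
        self.enabled = True

    def assign(self, value):
        """Store the value, then invoke the ON block if it is enabled."""
        self._value = value
        if self._on_block is not None and self.enabled:
            self._on_block(value)

    @property
    def value(self):
        return self._value
```

Setting `enabled` to False corresponds to a DISABLE statement: later
assignments proceed without invoking the block, and no change to the
monitored program is needed to turn the monitoring back on.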

The descriptor orientation of SYMBOL was a major factor in the
efficient implementation of the ON block facility. Descriptors were
sixty-four bits long and contained sixteen tag bits and two twenty-four
bit address fields. An identifier with an associated ON block had the
left address field pointing to the identifier value and the right address
field pointing to the start of the object code of the ON block. The
ENABLE and DISABLE statements either set or reset an "ON
Enabled" bit in the tag field. As the descriptor had to be referenced
for every identifier reference, checking to see if an identifier had an
ON block associated with it could be done in parallel with normal
accessing without loss of performance.
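
The descriptor layout just described can be sketched with integer bit
operations. The field widths (sixteen tag bits and two twenty-four bit
address fields) come from the text; the ordering of the fields within the
word and the position of the "ON Enabled" bit are assumptions made only
for illustration, as are all of the names.

```python
ADDRESS_MASK = (1 << 24) - 1   # twenty-four bit address field
TAG_MASK = (1 << 16) - 1       # sixteen tag bits
ON_ENABLED = 0x0001            # hypothetical position of the "ON Enabled" bit


def pack(tag, left, right):
    """Build a 64-bit descriptor: tag | left address | right address."""
    return (((tag & TAG_MASK) << 48)
            | ((left & ADDRESS_MASK) << 24)
            | (right & ADDRESS_MASK))


def unpack(descriptor):
    """Return the (tag, left address, right address) fields."""
    return ((descriptor >> 48) & TAG_MASK,
            (descriptor >> 24) & ADDRESS_MASK,
            descriptor & ADDRESS_MASK)


def set_on_enabled(descriptor, enabled):
    """Model ENABLE (True) or DISABLE (False) by changing one tag bit."""
    bit = ON_ENABLED << 48
    return (descriptor | bit) if enabled else (descriptor & ~bit)
```

Because enabling and disabling change a single tag bit and leave both
address fields untouched, neither the identifier value nor the ON block
object code has to be moved or regenerated.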
Evaluation

To a large extent the tools developed show what was easy or
reasonable to do with the SYMBOL architecture. Descriptors and type
tags allowed the type and values of data objects to be easily
interpreted. The additional level of indirection imposed by descriptors was
extremely important in implementing ON blocks. Being able to
selectively Enable or Disable ON blocks from INQUIRE or dynamically in
the user's program without recompilation drastically reduced the
compilations and editing that might otherwise have been required. Some
credit has to be given to the designers of the SYMBOL language for
introducing ON blocks with Enable and Disable statements.

Decompilation is a subject which requires several comments.
First, it must be stated that we had almost no control over the
instruction set or the code generated by the Translator. While we
could have generated better code with a software compiler,
experimentation proved a software compiler to be impractical because of its
slow speed. Fortunately, the high level instruction set and simple
code generation algorithms made object code relatively easy to invert.
On the negative side, decompilation was not trivial (some 900 lines of
code), nor was it fast (3 to 10 seconds/statement). Decompilation has
several other negative characteristics. Starting to decompile from the
middle of control flow instructions (e.g. if-then-else, looping, procedure
body) made decompiling the bottom part of the flow syntax difficult;
this could have been much easier if, for example, the jump over an
"else" clause had been distinct from other jumps. Comments and
declarations generated no code, and hence would never re-appear in a
decompiled program. The minor differences in number of blanks,
carriage returns, and case of letters were very irritating when trying to
find the same source line in an editor by using an exact string search.
On the whole, if one has control over the compiler there exist much
better techniques for mapping object code back into source
statements.4,7 Decompilation was used in our case because we had few
other options.
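
The irritation with exact string searches suggests comparing statements
after removing the differences known to arise in decompilation. The
sketch below is not something the SYMBOL tools are reported to have done;
it is a hypothetical illustration in which blanks, carriage returns and
letter case are ignored. Redundant parentheses are not handled, and
dropping all blanks could in principle make two different statements
compare equal.

```python
import re


def normalize(statement):
    """Drop all white space, fold case and remove a trailing semicolon."""
    return re.sub(r"\s+", "", statement).lower().rstrip(";")


def find_statement(source_text, decompiled):
    """Return the index of the first source statement matching `decompiled`.

    The source text is split at semicolons, and each statement is
    compared with the decompiled statement after normalization.
    Returns None when no statement matches.
    """
    target = normalize(decompiled)
    for index, statement in enumerate(source_text.split(";")):
        if normalize(statement) == target:
            return index
    return None
```

A statement that was spread over two source lines with extra blanks is
then found even though the decompiled form is a single lower-case line.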


Users of the SYMBOL system were very pleased with the
programming and debugging environment; in particular with the way
INQUIRE allowed the investigation of their block structured
programs. The software debugging tools were the finishing touch in
making SYMBOL a High Level Language Computer System,4 rather than
just a machine with a fancier instruction set. The disappointing part
for ex-users of the SYMBOL system is that there are no inherent
reasons why similar features could not be provided even on low level
language machines, yet such debugging systems are not appearing.
What the SYMBOL architecture did for us was make the job of
building some of our tools easier than would have been possible on a more
traditional machine.
Appendix 1. INQUIRE Help Text


This is the Inquiry subsystem, which permits examination of user-program variables and block
structure. The following inputs are accepted:

1. An identifier. The value of the identifier will be printed, if possible. Otherwise an
appropriate message will be produced. "Identifier" here includes LIMIT and LIMITED.

2. "LIMIT-n" where n is a number between 0 and 99. The value of LIMIT is set accordingly.

3. The character ">" followed by:

a. a number obtained from the output produced by the /BLOCK command (see below),

b. a string of one or more "↑" characters, or

c. nothing.

This respecifies the command environment as:
in case a. the specified block,

in case b. one block out from the current setting for each "↑" in the string (following
the dynamic nesting, i.e., the order of activation),

in case c. the environment which was current when the Monitor was invoked.

When the Inquiry mode is entered, case c. is assumed. In case b. if the current command
environment was not active when the Monitor was invoked, a message is printed and the
command environment is not changed.

4. The character "/" followed by a command keyword. Only enough of the keyword to
distinguish it from all others is required. The keywords are described in the following paragraphs:

5. "/BLOCK". The static block structure of the user program is printed, each block is
identified to the extent possible, the names and attributes of all identifiers known in each block are
listed, and each block is assigned a reference number for use in setting the command
environment (see paragraph 3). The current command environment and the blocks which were
active when the Monitor was invoked are identified. The listing may be terminated by
pressing F0.

6. "/DATA". The values of all identifiers known in the current command environment are
printed, similarly to paragraph 1. To cancel, press F0.

7. "/WHERE". If the specified block was active when the Monitor was invoked, a
reconstruction of the statement which was being executed will be displayed, and the Monitor will pause.
Pressing CONTINUE will evoke consecutive statements; pressing F0 will direct the Monitor
to input a new Inquiry command.

8. "/STACK". If the specified block was active when the Monitor was invoked, the top item
on its stack will be displayed similarly to paragraph 1. Pressing CONTINUE will display
successive stack items; pressing F0 will direct the Monitor to input a new Inquiry command.

9. "/ENABLE". An identifier is requested, and the ON-block associated with the identifier is
enabled. If the identifier does not have an ON-block, an appropriate message is produced.

10. "/DISABLE". Behaves similarly to paragraph 9, but the ON-block is disabled.

11. "/GOTO". A label is requested, and user program execution is resumed at that point. The
label must be in an environment which was active when the Monitor was invoked.

12. "/MONITOR". Return to Monitor.

13. "/EDIT". Equivalent to "/MONITOR" followed by "EDIT".

14. "/RESUME". Equivalent to "/MONITOR" followed by "RESUME".

15. "/PROCS". Similar to "/BLOCK", but does not list the identifiers in each block.

Any of the above inputs may be prefixed by one or more "↑" characters. This will cause the
command environment to be respecified as in paragraph 3b, but for that one input only.

If F0 is pressed while in input, the input is ignored.
Appendix 2. Example Program and Execution


NOTE  Demonstration  program.  Keywords  capitalized.  User  input  italicized; 

Inum |1|; Jnum |2|; Console |1|; NOTE Initial value statements;

Vector << 1|2|3 > < One|Two|Three > < 123,456.78|3x3 Matrix| >>;

Vector[345,6] - |A Scalar string.|;

Repeat Scan:

OUTPUT |What is LineA ?|; INPUT LineA;

OUTPUT |What is LineB ?|; INPUT LineB;
perform lexical scan( LineA, LineB, Stmnt );

Until( Inum EQUALS Jnum , Repeat Scan );

PROCEDURE WhichRoutine( name );

  GLOBAL Inum, Jnum;

  IF name SAME |PARSE|
  THEN RETURN 1;

  ELSE IF name AFTER |M|

  THEN Inum - 20; RETURN 2;

  ELSE Jnum - 3; RETURN 3;

  END END
END

PROCEDURE perform lexical scan( String1, String2, Statement );

  SWITCH Routine< Routine1 | Routine2 | Routine3 >;

  S1 - NoBlanks( String1 );
  target - WhichRoutine( String2 );

  GO TO Routine[ target ];

  Routine1: Statement - ( S1 FORM, rl*D0D.lFD|MASKl4SA.FCl) JOIN | Long |;

  RETURN;

  Routine2: Statement - test( String1 BEFORE String2 AND Target EQUALS 2 );
  RETURN;

  PROCEDURE test( boolop );

    OUTPUT |Paused.|;
    PAUSE;

    IF boolop THEN RETURN 0 ELSE RETURN 1 END
  END

  PROCEDURE NoBlanks( line );

    BLOCK
      test - 5;

    END

    RETURN line MASK |FA|;
  END

END NOTE End of perform lexical scan;

PROCEDURE  Until(  Condition  ,  Label  ); 

IF  Condition  THEN  RETURN  ELSE  GO  TO  Label  END 
END 

ON Jnum;

  GLOBAL Jnum, Inum, Console;

  IF Jnum EQUALS Inum OR Inum GREATER THAN 17
  THEN OUTPUT TO Console, |Error Detected - Jnum Invalid| L |Paused.|;
  PAUSE;

  END

END
Input from user is boldface.
Comments are in italics.


run 

What is LineA ?
493.08

What is LineB ?

TZBI2J

Paused.

(At this point the user presses the interrupt key.)

MONITOR  IS  NOW  IN  CONTROL. 

?inquire

/where
PAUSE ;

↑

IF boolop THEN
RETURN 0;

↑/where

Routine2: Statement - test(String1 BEFORE String2 AND target EQUALS 2);


/procs

MAIN PROGRAM = 1 ACTIVE (LEVEL 1)

ON BLOCK FOR Jnum = 2
PROCEDURE Until = 3

PROCEDURE perform lexical scan = 4 ACTIVE (LEVEL 2)
PROCEDURE NoBlanks = 5
INNER BLOCK = 6

PROCEDURE test = 7 ACTIVE (LEVEL 3)

**INTERRUPT IN THIS BLOCK**

**CURRENT COMMAND ENVIRONMENT**

PROCEDURE WhichRoutine = 8

string1

"STRING1" IS NOT KNOWN IN THE SPECIFIED ENVIRONMENT.
↑string1

STRING1: formal parameter.

The actual parameter is "LineA" (in the calling environment).

LineA: |493.08|

↑statement

STATEMENT: formal parameter.

The actual parameter is "Stmnt" (in the calling environment).

Stmnt: null

/resume

What is LineA ?

353.937

What is LineB ?

A1B2C3.

Error Detected - Jnum Invalid

Paused.

(User presses interrupt.)

MONITOR IS NOW IN CONTROL.
?inquire
/where
PAUSE ;

↑

↑/where
Jnum - 3;

(User presses Continue for another source line.)

RETURN 3;

↑↑/where

target - WhichRoutine(String2);
↑↑↑/where

perform lexical scan(LineA, LineB, Stmnt);


↑↑↑↑/where

ATTEMPT TO BACK UP BEYOND OUTERMOST BLOCK


/procs

MAIN PROGRAM = 1 ACTIVE (LEVEL 1)

ON BLOCK FOR Jnum = 2 ACTIVE (LEVEL 4)

**INTERRUPT IN THIS BLOCK**

**CURRENT COMMAND ENVIRONMENT**

PROCEDURE Until = 3

PROCEDURE perform lexical scan = 4 ACTIVE (LEVEL 2)

PROCEDURE NoBlanks = 5
INNER BLOCK = 6

PROCEDURE test = 7

PROCEDURE WhichRoutine = 8 ACTIVE (LEVEL 3)

/resume 

*** EXECUTION ERROR ("GO" TO NON-LABEL, CODE 82)

IN THE FOLLOWING STATEMENT

GO TO Routine[ target ];

↑

OPERAND:

null

MONITOR IS NOW IN CONTROL.

?inquire
target

TARGET: |3|
routine[3]

ROUTINE[3]: "Routine3" (switch component)
routine3

ROUTINE3: null
/edit

?search FOR "Routine2" FROM LINE 1

ib : Routine2: Statement - test( String1 BEFORE String2 AND Target EQUALS 2 );
?insert AFTER LINE r 2


niullllr.l.  M)ll-.  ililillllH  ihn  mile  ii/iii  iIim  men  nl  mining  /«/»■/.  Kl  I'llHN 
?inquire

/goto: LABEL Routine1


What is LineA ?

999.089

What is LineB ?

(User presses interrupt.)

MONITOR IS NOW IN CONTROL.

?inquire

/block

MAIN PROGRAM = 1 ACTIVE (LEVEL 1)
**INTERRUPT IN THIS BLOCK**

**CURRENT COMMAND ENVIRONMENT**


Inum, Jnum(ON), Console, Vector(S), Repeat Scan(L), LineA, LineB,
perform lexical scan(Pr), Stmnt, Until(Pr), WhichRoutine(Pr)


ON BLOCK FOR Jnum = 2
Jnum(G,ON), Inum(G), Console(G)


PROCEDURE Until = 3
Condition(Pa), Label(Pa)

PROCEDURE perform lexical scan = 4

String1(Pa), String2(Pa), Statement(Pa), Routine(S), Routine1(L),
Routine2(L), Routine3, S1, NoBlanks(Pr), target, WhichRoutine(G,Pr),
test(Pr)

PROCEDURE NoBlanks = 5

line(Pa)

INNER BLOCK = 6
test


PROCEDURE test = 7
boolop(Pa)

PROCEDURE WhichRoutine = 8
name(Pa), Inum(G), Jnum(G,ON)

test

"TEST" IS NOT KNOWN IN THE SPECIFIED ENVIRONMENT.

>4

test

TEST: procedure.

>6

test

TEST: |5|

>8

/data

name: formal parameter.

Inum: global in the defining environment.

Inum: |20|

Jnum: global in the defining environment.

Jnum: |3|

>1

/data

Inum: |20|
Jnum: |3|
Console: |1|
Vector

[1,1]: |1| [1,2]: |2| [1,3]: |3|

[2,1]: |One| [2,2]: |Two| [2,3]: |Three|

[3,1]: |123,456.78| [3,2]: |3x3 Matrix| [3,3]: | |

[4-344]: nulls

[345,1-5]: nulls [345,6]: |A Scalar string.|

Repeat Scan: label.

LineA: |999.089|

LineB: |A1B2C3.|

perform lexical scan: procedure.

Stmnt: l$5550«>2Umjil
Until: procedure.

WhichRoutine: procedure.

/disable: IDENTIFIER - Jnum
/enable: IDENTIFIER - repeat scan
NO ON BLOCK FOR REPEAT SCAN

/monitor

?terminate

END  OF  TERMINAL  SESSION 
PROCESSING TIME: 24.2 SEC.

DURATION  OF  SESSION:  60  2  MIN 

PRESS  CTRL-0  TO  START 